Notes on Weinberg’s Quantum Theory of Fields
Volume I: Fundamentals
Jimmy Qin
Summer 2019
A theorist today is hardly considered respectable if he or she has not introduced at least one new
particle for which there is no experimental evidence.
It is with Isaac Newton that the modern dream of a final theory really begins.
I managed to get a quick PhD, though when I got it I knew almost nothing about physics. But I
did learn one big thing: that no one knows everything, and you don’t have to.
These are notes on Weinberg’s text The Quantum Theory of Fields. I read it after reading through
almost all of Schwartz’ Quantum Field Theory. Volume I is on theoretical foundations, QED, and
loop calculations; volume II is on RG, non-Abelian gauge theory, and symmetry breaking; volume
III is on supersymmetry. Besides being the definitive textbook for physicists and mathematicians
on quantum field theory, Weinberg’s text is also a (perhaps the) cornerstone of QFT studies in
the philosophy of science.
These notes are written in question-answer form instead of the usual narrative form. This is
because Weinberg is a hard book, and I think it is best to digest it in small bites. Also included
are some general questions and answers which are not in Weinberg but are good to know, and
my solutions to some of the interesting problems. In these notes, I will try to understand the big
picture and the physics behind the ideas, so many of these questions will be elementary. I will try
to understand the material with as little math as possible. My emphasis will be on giving physical
arguments and intuition rather than rigor.
Question 1. Why is relativistic quantum field theory so hard?
Answer 1. Relativistic quantum field theory, the subject of this three-volume series, is very
difficult. In Novum Organum, the philosopher Francis Bacon postulated that science proceeds by
developing axioms. Experiment leads to axioms, and the results of those axioms are benchmarked
against further experiment.
• Classical mechanics has three axioms: Newton’s laws.
• Classical electrodynamics has four axioms: Maxwell’s equations.
• Special relativity has two axioms: Einstein’s postulates.
• Nonrelativistic quantum mechanics has five or so axioms: Born interpretation of matrix
elements, Schrodinger equation, etc.
• General relativity has one axiom: Einstein’s equation.
Actually, all of the above subjects have more axioms implicitly assumed. To solve a problem in
the above subjects, you just apply the axioms. However, you would be hard-pressed to state the
“axioms of quantum field theory.” Attempts at formulating such an axiomatic development (e.g.
by Wightman) have fallen short of encompassing all of quantum field theory. Besides, they are
very far removed from the physics.
While it is true that the axioms of special relativity and quantum mechanics are imported into
quantum field theory, we are missing some other axioms which would tell us how to construct
the theory. For example, the idea of wavefunction must be modified from quantum mechanics to
quantum field theory, and there is no axiom for that. There are only very general principles, like
energy conservation and cluster decomposition. Showing that these principles in fact constrain
the space of reasonable theories to QFT as we know it today is the thesis of Weinberg’s book.
Contents
1 History
2 Relativistic Quantum Mechanics
2.1 Wigner’s theorem
2.2 Poincaré algebra
2.3 Wigner little group
2.4 Discrete symmetries
2.5 Problems
3 Scattering Theory
3.1 S-matrix
3.2 Symmetries of the S-matrix
3.3 Rates and cross-sections
3.4 Implications of unitarity
3.5 Resonances
3.6 Problems
4 The Cluster Decomposition Principle
4.1 Operator decomposition in $a_q^\dagger$ and $a_q$
4.2 Factorization of the S-matrix
4.3 Which Hamiltonians satisfy cluster decomposition?
4.4 Problems
5 Quantum Fields and Antiparticles
5.1 General free fields
5.2 Causal scalar fields
5.3 Causal vector fields
5.4 Spinor representation of Lorentz group
5.5 Causal Dirac field
5.6 General irreps of the Lorentz group
5.7 CPT theorem; spin-statistics theorem
5.8 Massless fields
5.9 Problems
6 Feynman Rules
6.1 Dyson’s derivation in position space
6.2 Propagators
6.3 Problems
7 The Canonical Formalism
7.1 Canonical variables
7.2 Lagrangian formalism
7.3 Global symmetries and Noether theorem
7.4 Lorentz invariance of L implies Lorentz invariance of S-matrix
7.5 Examples of canonical quantization and transition to interaction picture
7.6 Problems
8 Electrodynamics
8.1 Gauge invariance
8.2 Difficulties with quantizing the photon field
8.3 QED in the interaction picture; photon propagator
8.4 p-form gauge fields
8.5 Problems
9 Path Integrals
9.1 Hamiltonian version of path-integral formula; S-matrix
9.2 Lagrangian version of path-integral formula
9.3 Path-integral derivation of Feynman rules
9.4 Path-integral formulation of QED
9.5 Problems
10 Non-perturbative methods
10.1 Symmetries
10.2 Polology
10.3 Field and mass renormalization
10.4 Renormalized charge; Ward-Takahashi identity
10.5 Gauge invariance of the S-matrix
10.6 More on the LSZ reduction formula
11 One-loop radiative corrections in QED
11.1 Counterterms
11.2 Vacuum polarization
11.3 Interlude: Form factors
11.4 Anomalous magnetic moment
11.5 Electron propagator
11.6 Problems
12 General renormalization theory
12.1 Degrees of divergence
12.2 Cancellation of divergences
12.3 Is renormalizability necessary?
12.4 Problems
13 Infrared effects
13.1 Soft photon amplitudes
13.2 Virtual soft photons
13.3 Cancellation of divergences
1 History
2 Relativistic Quantum Mechanics
Question 2. What is the difference between relativistic quantum mechanics and quantum field
theory?
Answer 2. Relativistic quantum mechanics is still quantum mechanics, in the sense that nothing
has been second-quantized yet. It is the study of how wavefunctions Ψ behave under relativistic transforms, such as Lorentz boosts and rotations, and how they behave under more general
(anti)unitaries, such as PCT. Quantum field theory must be consistent with relativistic quantum
mechanics, but now the wavefunctions are second-quantized, and particle number is no longer
conserved.
Just like special relativity is the study of how objects appear to observers in different frames,
relativistic quantum mechanics is the study of how wavefunctions appear to observers in different
frames. In fact, the normalization N (p) of a wavefunction under Lorentz transform (see subsection
on Wigner little group) can be interpreted as a length-contraction correction to the normalization
of the wavefunction.
2.1 Wigner’s theorem
Question 3. How is relativistic quantum mechanics consistent between observers?
Answer 3. The physically distinguishable quantities in a Hilbert space are called rays: equivalence classes of vectors (wavefunctions) that differ only by a phase. If there are two observers A and B, RQM is consistent if
$$|\langle R^A | R^A_n \rangle|^2 = |\langle R^B | R^B_n \rangle|^2,$$
where $R$ denotes a ray and the superscript denotes which observer is used. This is sloppy notation: strictly, I can only take the inner product of wavefunctions which belong to the rays indicated above. Wigner proved that any transformation between the A and B observers,
$$U(R^A) = R^B,$$
must be either unitary or antiunitary. The easiest way to think about this is that only these transformations conserve probability. A unitary $U$ and an antiunitary $A$ are linear and antilinear, respectively, and satisfy
$$\langle \Phi | U^\dagger \Psi \rangle = \langle U\Phi | \Psi \rangle, \qquad \langle \Phi | A^\dagger \Psi \rangle = \langle \Psi | A\Phi \rangle.$$
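The distinction between the two cases can be made concrete on a two-state system. The following is a minimal sketch, not taken from Weinberg: the matrix `U`, the angle `THETA`, and the test vectors `phi` and `psi` are arbitrary illustrative choices, and the antiunitary map is assumed to be of the form "complex conjugation followed by a unitary". It checks that the unitary map leaves the overlap unchanged, the antiunitary map complex-conjugates it, and both preserve the probability.

```python
import cmath
import math

def inner(phi, psi):
    """Inner product <phi|psi>, antilinear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def apply(matrix, vec):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

# Hypothetical 2x2 unitary: a diagonal phase matrix times a real rotation.
THETA = 0.7
U = [
    [cmath.exp(0.3j) * math.cos(THETA), -cmath.exp(0.3j) * math.sin(THETA)],
    [cmath.exp(-1.1j) * math.sin(THETA), cmath.exp(-1.1j) * math.cos(THETA)],
]

def unitary(vec):
    """Linear map: Psi -> U Psi."""
    return apply(U, vec)

def antiunitary(vec):
    """Antilinear map A = U K: conjugate the components, then apply U."""
    return apply(U, [complex(c).conjugate() for c in vec])

# Arbitrary (unnormalized) test vectors.
phi = [1 + 2j, 0.5 - 1j]
psi = [-0.3j, 2 + 1j]

overlap = inner(phi, psi)
u_overlap = inner(unitary(phi), unitary(psi))
a_overlap = inner(antiunitary(phi), antiunitary(psi))

print(abs(u_overlap - overlap))              # unitary: overlap unchanged
print(abs(a_overlap - overlap.conjugate()))  # antiunitary: overlap conjugated
print(abs(abs(u_overlap) ** 2 - abs(a_overlap) ** 2))  # same probability
```

All three printed numbers should vanish up to rounding error.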
Question 4. Prove Wigner’s unitary-antiunitary theorem.
Answer 4. The idea of Wigner’s theorem (conservation of probability) is extremely simple. However, there are some awkward steps involved in going between rays and state vectors.
Let observer A be denoted by no prime and observer B be denoted by a prime, $'$, and call $T$ the transformation between A’s rays and B’s rays. We demand
$$|\langle\Psi_1|\Psi_2\rangle|^2 = |\langle\Psi_1'|\Psi_2'\rangle|^2 \quad \text{for all } \Psi_1 \in R_1,\ \Psi_2 \in R_2,\ \Psi_1' \in TR_1,\ \Psi_2' \in TR_2.$$
Suppose $|\Psi_k\rangle$ is an orthonormal basis of state vectors in A’s frame. If the postulates of quantum mechanics are valid in B’s frame and the assumption above holds, then $|\Psi_k'\rangle$ is an orthonormal basis of state vectors in B’s frame.
Now we must specify how $T$, which acts on rays, can turn into an action $U$ on state vectors $|\Psi\rangle$. Let $|\Psi_k'\rangle = U|\Psi_k\rangle$. From the above, we find that for general expansions $\Psi = \sum_k C_k \Psi_k$, $\Psi' = \sum_k C_k' \Psi_k'$, we have (using, for example, $|\langle \tfrac{1}{\sqrt{2}}(\Psi_k + \Psi_l)|\Psi\rangle|^2 = |\langle \tfrac{1}{\sqrt{2}}(\Psi_k' + \Psi_l')|\Psi'\rangle|^2$)
$$|C_k|^2 = |C_k'|^2, \quad |C_k + C_l|^2 = |C_k' + C_l'|^2, \quad |C_k + C_l + C_m|^2 = |C_k' + C_l' + C_m'|^2, \ \cdots \ \text{for all } k, l, m, \cdots$$
The first two equalities give
$$\mathrm{Re}\left(\frac{C_k}{C_l}\right) = \mathrm{Re}\left(\frac{C_k'}{C_l'}\right), \quad \mathrm{Im}\left(\frac{C_k}{C_l}\right) = \pm\,\mathrm{Im}\left(\frac{C_k'}{C_l'}\right) \implies \frac{C_k}{C_l} = \frac{C_k'}{C_l'} \ \text{ or } \ \frac{C_k}{C_l} = \left(\frac{C_k'}{C_l'}\right)^*.$$
The first case is unitary and the second is antiunitary. It is not hard to show that the operator $U$ cannot be a hybrid of unitary ($\frac{C_k}{C_l} = \frac{C_k'}{C_l'}$) and antiunitary ($\frac{C_k}{C_l} = (\frac{C_k'}{C_l'})^*$) parts; this would violate $|C_k + C_l + C_m|^2 = |C_k' + C_l' + C_m'|^2$. Combining this with $|C_k|^2 = |C_k'|^2$ gives $C_k' = \zeta C_k$ or $C_k' = \zeta C_k^*$, where $\zeta$ is a phase factor independent of $k$. The phase $\zeta$ can be eliminated by rotating each state vector $\Psi_k$ by $\zeta$. Effectively, $\zeta$ is the extra freedom afforded to rays over state vectors.
Finally, we can show $U$ is unitary in the first case and antiunitary in the second case by showing, for example, $\langle U\Psi|U\Phi\rangle = \langle\Psi|\Phi\rangle$ or $\langle\Phi|\Psi\rangle$, respectively.
Unsurprisingly, this logic can be generalized. See https://arxiv.org/abs/1810.10111.
2.2 Poincaré algebra
Doing manipulations on the Poincaré algebra is good practice for any kind of algebra, so I will
study this example closely.
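Question 5. Why is the generator $\omega_{\mu\nu}$ antisymmetric?
Answer 5. For any Lie group with transformations $\Lambda^\mu{}_\nu$ that preserve a metric, we must have
$$g^{\mu\sigma} = \Lambda^\mu{}_\nu \Lambda^\sigma{}_\rho\, g^{\nu\rho}.$$
Substituting the infinitesimal transformations $\Lambda^\mu{}_\nu = \delta^\mu{}_\nu + \omega^\mu{}_\nu$ and keeping terms of first order in $\omega$ gives $\omega^{\mu\nu} + \omega^{\nu\mu} = 0$.
This holds for SO(n) and SO(n, m), generally. It holds whenever distances are preserved. Another, equivalent way to think about it is that
$$v'_\mu = g_{\mu\sigma} \Lambda^{\sigma\nu} v_\nu \implies v'^\mu v'_\mu = g_{\mu\sigma}\, g^{\mu\pi} \Lambda_{\pi\rho} \Lambda^{\sigma\nu}\, v^\rho v_\nu.$$
For distances to be preserved, we would like
$$g_{\mu\sigma}\, g^{\mu\pi} \Lambda_{\pi\rho} \Lambda^{\sigma\nu} = \Lambda_{\mu\rho} \Lambda^{\mu\nu} = \delta_\rho{}^\nu.$$
Expanding $\Lambda^\mu{}_\rho = \delta^\mu{}_\rho + \omega^\mu{}_\rho$ around the identity gives the antisymmetry relation.
Yet another way to think of it is: let $Q(t)$ be an orthogonal matrix function, $Q^T Q = I$, and suppose $Q(t = 0) = I$. Evaluating the derivative at $t = 0$, we find
$$\dot{Q}^T Q + Q^T \dot{Q} = 0 \implies \dot{Q}^T = -\dot{Q} \implies \omega^T = -\omega,$$
where $\omega$ is a generator of the Lie algebra.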
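A point that is easy to trip over is that the antisymmetry refers to $\omega$ with both indices at the same height: for a boost, the mixed-index matrix $\omega^\mu{}_\nu$ is symmetric, and only after lowering an index with the metric does it become antisymmetric. The sketch below illustrates this numerically. It assumes the signature $(-,+,+,+)$ and index order $(t, x, y, z)$; the helper names (`boost_x`, `matmul`, and so on) and the numerical values are illustrative choices, not anything from the text.

```python
import math

# Assumed conventions: signature (-,+,+,+), index order (t, x, y, z).
ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

def matmul(a, b):
    """Product of two 4x4 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))

def boost_x(rapidity):
    """Finite boost along x, as the mixed-index matrix Lambda^mu_nu."""
    ch, sh = math.cosh(rapidity), math.sinh(rapidity)
    return [[ch, sh, 0, 0], [sh, ch, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

# 1) A finite boost preserves the metric: Lambda^T eta Lambda = eta
#    (an equivalent form of the metric-preservation condition).
lam = boost_x(0.8)
metric_after = matmul(transpose(lam), matmul(ETA, lam))

# 2) Near the identity, Lambda = 1 + omega with this mixed-index omega.
EPS = 1e-3
omega_mixed = [[0, EPS, 0, 0], [EPS, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
first_order = [[IDENTITY[i][j] + omega_mixed[i][j] for j in range(4)]
               for i in range(4)]

# 3) Lower the first index: omega_{mu nu} = eta_{mu lambda} omega^lambda_nu.
omega_lower = matmul(ETA, omega_mixed)

print(max_abs_diff(metric_after, ETA))          # metric preserved
print(max_abs_diff(boost_x(EPS), first_order))  # agreement up to O(EPS^2)
print(omega_mixed[0][1], omega_mixed[1][0])     # mixed indices: symmetric
print(omega_lower[0][1], omega_lower[1][0])     # lowered indices: antisymmetric
```

For rotations the metric acts trivially on the spatial indices, so the mixed-index and lowered-index matrices coincide and are both antisymmetric.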
Question 6. Describe the difference between a Lorentz transformation (or the Poincaré algebra) and the unitary that implements the transformation on a quantum field.
Answer 6. Infinitesimally, the difference is
$$\Lambda^\mu{}_\nu = \delta^\mu{}_\nu + \omega^\mu{}_\nu \quad \text{versus} \quad U(1 + \omega) = 1 + \frac{i}{2}\omega_{\rho\sigma} J^{\rho\sigma}.$$
$\Lambda$ acts on $x$ or $p$ and $U$ acts on $|\Psi\rangle$. Generally, there exists a unitary $U(\Lambda, a)$ that describes how the wavefunction changes under the composition of a Lorentz transformation $\Lambda$ and a translation $a$. Here are some distinctions:
• ωµν is a matrix acting on 4-vectors (in this sense, it is a matrix). However, Jµν is the operator
acting on wavefunctions Ψ. This is because U acts on a Fock space, so it manifestly must
be an operator. Λ does not act on the Fock space.
• ωµν is a scalar (well, a matrix made of scalars) and describes the magnitude of rotations and
boosts. It is the generalization to 4 × 4 matrices of the 3 × 3 SO(3) algebra. Jµν is not a
scalar. It is an operator, though its eigenvalues have physical meaning (for example, angular
momentum or energy).
• However, both ω and J are antisymmetric.
• ω doesn’t care about what kind of particle we are acting on, but J does. For example, the SO(1,3) algebra described by ω includes rotations, under which
$$U \sim e^{-iS\theta},$$
where S is the spin. So clearly, the spin is encoded in J. Even though different particles are described by different kinds of J, the J’s for any particle always satisfy the same commutation relations, as described below.
Question 7. Construct the infinitesimal unitary operator that transforms wavefunctions under the Poincaré group. Derive the algebra for the generators of this unitary.
Answer 7. Lorentz transformations are characterized by the “angles” $\omega_{\mu\nu}$ of the rotation matrix and translations are merely characterized by the shift $\epsilon_\rho$. To leading order, the change of a wavefunction under a Lorentz transformation and a translation must be linear in the angles and displacements,
$$U(1 + \omega, \epsilon) = 1 + \frac{i}{2}\omega_{\rho\sigma} J^{\rho\sigma} - i\epsilon_\rho P^\rho + \cdots.$$
If $U$ is unitary, then $J^{\rho\sigma}$ and $P^\rho$ are Hermitian. By the composition rule $U(\Lambda_2, a_2)\,U(\Lambda_1, a_1) = U(\Lambda_2\Lambda_1, \Lambda_2 a_1 + a_2)$, we find that
$$U(\Lambda, a)\,U(1 + \omega, \epsilon)\,U^{-1}(\Lambda, a) = U\big(\Lambda(1 + \omega)\Lambda^{-1},\ \Lambda\epsilon - \Lambda\omega\Lambda^{-1}a\big).$$
Taking $\Lambda$, $a$ themselves to be infinitesimal gives the Lie algebra of the Poincaré group. This is just like how commutators pop up in response functions, $\chi \sim [A(t), A(0)]$, because there is an $e^{-iHt}A(0)e^{iHt}$ structure, which becomes a commutator.
Conclusion: if you want to find the algebra of a Lie group and know the composition of the underlying spatial transformations (for example, $R_{12} = R_1 R_2$ for rotations), use $U$, $U^{-1}$ as bread to make a sandwich with $\tilde{U}$ as the filling. This will generate a commutator in the case where both $U$ and $\tilde{U}$ are infinitesimal. An important point is that the structure of the transformations on the Fock space (Lie algebra of the Poincaré group) depends intimately on the structure of the transformations on the underlying physical space (compositions of Lorentz boosts and translations).
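For reference, here is a sketch of what the sandwich yields when it is carried through. It is stated in what I take to be Weinberg's conventions ($\eta$ is the Minkowski metric, and the generators are defined by the expansion of $U(1+\omega, \epsilon)$ above). The overall signs depend on the metric signature and on the sign convention for the generators, so treat this as a guide to the structure rather than a substitute for checking against the text. The first two lines are the finite transformation rules obtained by matching terms linear in $\omega$ and $\epsilon$; the last three follow by taking $\Lambda$ and $a$ infinitesimal as well.

```latex
\begin{aligned}
U(\Lambda,a)\,J^{\rho\sigma}\,U^{-1}(\Lambda,a)
  &= \Lambda_\mu{}^{\rho}\,\Lambda_\nu{}^{\sigma}
     \left(J^{\mu\nu} - a^{\mu}P^{\nu} + a^{\nu}P^{\mu}\right),\\
U(\Lambda,a)\,P^{\rho}\,U^{-1}(\Lambda,a)
  &= \Lambda_\mu{}^{\rho}\,P^{\mu},\\[4pt]
i\,[J^{\mu\nu}, J^{\rho\sigma}]
  &= \eta^{\nu\rho}J^{\mu\sigma} - \eta^{\mu\rho}J^{\nu\sigma}
   - \eta^{\sigma\mu}J^{\rho\nu} + \eta^{\sigma\nu}J^{\rho\mu},\\
i\,[P^{\mu}, J^{\rho\sigma}]
  &= \eta^{\mu\rho}P^{\sigma} - \eta^{\mu\sigma}P^{\rho},\\
[P^{\mu}, P^{\rho}] &= 0.
\end{aligned}
```

The first commutator says that $J^{\mu\nu}$ transforms as a tensor, the second that $P^\mu$ transforms as a vector, and the third that translations commute.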
2.3 Wigner little group
Question 8. Why is the Wigner little group useful? Why do we need it?
Answer 8. Typically it is most convenient for us to describe particles with momentum (and spin) eigenstates, as in $\Psi_{p\sigma}$. These are eigenvectors of the momentum operator. The problem is that $\Psi_{p\sigma}$ is not an eigenvector of the Lorentz group (boosts and rotations).
The Wigner little group preserves momentum but not spin projection, although it does preserve the total spin of a particle. It will turn out to be useful later.
Question 9. What is Wigner’s classification?
Answer 9. Wigner’s classification uses two Casimirs ($P^2$ and $W^2$, where $P^\mu$ is the 4-momentum and $W^\mu$ is the Pauli-Lubanski pseudovector) to classify the irreducible unitary representations of the Poincaré (inhomogeneous Lorentz) group. These represent particles (or one-particle states) and are infinite-dimensional.
See https://en.wikipedia.org/wiki/PauliLubanski_pseudovector.
However, later we will embed these infinite-dimensional unitary representations into finite-dimensional
nonunitary representations.
Question 10. How do one-particle states $\Psi_{p\sigma}$ transform under the Lorentz group?
Answer 10. Recall that the Lorentz group is basically a glorified rotation group. What we are looking for here are essentially glorified Clebsch-Gordan coefficients $\langle\Lambda p, \sigma'|p\sigma\rangle$,
$$U(\Lambda)|p\sigma\rangle = \sum_{\sigma'} \langle\Lambda p, \sigma'|p\sigma\rangle\, |\Lambda p, \sigma'\rangle.$$
So generally, states labeled with momentum and spin will go to a certain momentum, but their spin will split. The magnitude of the spin vector, however, is preserved.
Question 11. What is the method of induced representations?
Answer 11. The Clebsch-Gordan coefficients from above are different for each representation of the Poincaré group. The claim of induced representations is: the Clebsch-Gordan coefficients for the entire Lorentz group are uniquely determined by the representation of the little group.
This is not hard to show for the example of the Lorentz group. We would like to find the Clebsch-Gordan coefficients $C_{\sigma'\sigma}(\Lambda, p)$ in the Lorentz-boost formula
$$U(\Lambda)\Psi_{p\sigma} = \sum_{\sigma'} C_{\sigma'\sigma}(\Lambda, p)\,\Psi_{\Lambda p,\sigma'}.$$
Choose the reference momentum $k^\mu$, and define the states $\Psi_{p\sigma}$ by
$$\Psi_{p\sigma} = N(p)\,U(L(p))\,\Psi_{k\sigma}.$$
Here, $N(p)$ is a normalization factor and $L^\mu{}_\nu(p)\, k^\nu = p^\mu$. It’s not obvious why the spin is conserved in this definition; it doesn’t make physical sense to me. In other words, it doesn’t look like a physically reasonable definition.
Let $W$ be a Lorentz transformation in the little group, $W^\mu{}_\nu k^\nu = k^\mu$. Then
$$U(W)\Psi_{k\sigma} = \sum_{\sigma'} D_{\sigma'\sigma}(W)\,\Psi_{k\sigma'}.$$
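Writing
$$U(\Lambda)\Psi_{p\sigma} = N(p)\,U(L(\Lambda p))\,U\big(L^{-1}(\Lambda p)\Lambda L(p)\big)\,\Psi_{k\sigma}$$
and setting $W(\Lambda, p) = L^{-1}(\Lambda p)\Lambda L(p)$ gives
$$U(\Lambda)\Psi_{p\sigma} = \frac{N(p)}{N(\Lambda p)} \sum_{\sigma'} D_{\sigma'\sigma}(W(\Lambda, p))\,\Psi_{\Lambda p,\sigma'}.$$
Therefore, $C_{\sigma'\sigma}(\Lambda, p) = \frac{N(p)}{N(\Lambda p)} D_{\sigma'\sigma}(W(\Lambda, p))$. $W(\Lambda, p)$ is called the Wigner rotation.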
How can we think about this? Consider the little group of a massive particle, SO(3). This is a subgroup of the full proper orthochronous Lorentz group, $SO(1, 3)^+$; in particular, SO(3) encodes the behavior of the wavefunction under rotations. The claim of induced representations is that if I tell you how the wavefunction behaves under rotations, you will also know how it behaves under boosts, and under combinations of boosts and rotations. This makes sense if you think of both the little-group transformations and the Lorentz transformations of the spin-$j$ particle $\Psi$ as $(2j + 1) \times (2j + 1)$ matrices, and the wavefunction as $\Psi_p = (\Psi_{p,-j}, \Psi_{p,-j+1}, \cdots, \Psi_{p,j})^T$. Clearly, different spins have differently-sized matrices. You can also think about it like this: the Lorentz group is split into two parts:
Lorentz group = little group ⊗ pure Lorentz boosts from $k^\mu$.
The latter is already specified by the definition $\Psi_{p\sigma} = N(p)\,U(L(p))\,\Psi_{k\sigma}$.
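To see the Wigner rotation $W(\Lambda, p) = L^{-1}(\Lambda p)\Lambda L(p)$ as an actual rotation, it helps to compute it for 4-vectors. The sketch below does this for a massive particle with reference momentum $k = (m, \vec{0})$. It takes $L(p)$ to be the pure boost from rest to momentum $\vec{p}$, which is my understanding of the standard choice for massive particles; the mass, momentum, and rapidity values are arbitrary, and the index order $(t, x, y, z)$ is an assumption of the example.

```python
import math

def matmul(a, b):
    """Product of two 4x4 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def apply(a, v):
    """Matrix acting on a 4-vector."""
    return [sum(a[i][j] * v[j] for j in range(4)) for i in range(4)]

def standard_boost(p, m):
    """Pure boost L(p) taking k = (m, 0, 0, 0) to (E, p); index order (t, x, y, z)."""
    px, py, pz = p
    pmag = math.sqrt(px * px + py * py + pz * pz)
    boost = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    if pmag == 0.0:
        return boost
    gamma = math.sqrt(pmag ** 2 + m ** 2) / m
    n = [px / pmag, py / pmag, pz / pmag]
    boost[0][0] = gamma
    for i in range(3):
        boost[0][i + 1] = boost[i + 1][0] = n[i] * math.sqrt(gamma ** 2 - 1.0)
        for j in range(3):
            boost[i + 1][j + 1] += (gamma - 1.0) * n[i] * n[j]
    return boost

M = 1.0                      # particle mass (arbitrary)
k = [M, 0.0, 0.0, 0.0]       # reference momentum
p = (0.0, 1.5, 0.0)          # spatial momentum along y (arbitrary)

# Lambda: a boost along x with rapidity 0.6, not collinear with p.
ch, sh = math.cosh(0.6), math.sinh(0.6)
lam = [[ch, sh, 0, 0], [sh, ch, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

four_p = apply(standard_boost(p, M), k)   # L(p) k = p
lam_p = apply(lam, four_p)                # Lambda p
q = lam_p[1:]                             # spatial part of Lambda p

# For a pure boost, L(q)^{-1} = L(-q).
inverse_boost = standard_boost([-c for c in q], M)
W = matmul(inverse_boost, matmul(lam, standard_boost(p, M)))

Wk = apply(W, k)                          # little-group property: W k = k
R = [row[1:] for row in W[1:]]            # spatial 3x3 block of W
RtR = [[sum(R[a][i] * R[a][j] for a in range(3)) for j in range(3)]
       for i in range(3)]                 # equals the identity for a rotation

print(Wk)
print(R)
```

Here `Wk` reproduces `k`, and `R` is an orthogonal matrix that mixes the x and y components: a rotation about the z axis. This is the matrix that gets fed into $D_{\sigma'\sigma}$. If $\Lambda$ is chosen collinear with $\vec{p}$, the same code gives the identity, because collinear boosts compose to a pure boost.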
Question 12. In spirit, how is the above transformation law different for massless particles?
Answer 12. Massless particles travel at the speed of light, so we cannot ever “pass” them. Therefore, their helicity $\sigma$ (the eigenvalue of the operator $J_3$) is conserved. As we show below, the eigenvalues $a$, $b$ of the generators $A$, $B$ of the ISO(2) little group must be equal to zero. The Wigner rotation therefore only cares about the operator $J_3$, and we can write
$$U(\Lambda)\Psi_{p\sigma} = \sqrt{\frac{(\Lambda p)^0}{p^0}}\; e^{i\sigma\theta(\Lambda, p)}\, \Psi_{\Lambda p,\sigma}.$$
Here,
$$W(\Lambda, p) = L^{-1}(\Lambda p)\Lambda L(p) = S\big(\alpha(\Lambda, p), \beta(\Lambda, p)\big)\, R\big(\theta(\Lambda, p)\big).$$
Question 13. How does $\Psi_{p\sigma} = N(p)\,U(L(p))\,\Psi_{k\sigma}$ encode length contraction?
Answer 13. The normalization is
$$N(p) = \sqrt{\frac{k^0}{p^0}} = \sqrt{\frac{1}{\gamma}}.$$
This makes sense if you think of $N(p)^2\, d^3\vec{p}$ as the number of particles. Because $d^3\vec{p}/E$ is Lorentz-invariant, we demand that $N(p)^2 E$ be Lorentz invariant. $E \sim \gamma \implies N(p)^2 \sim \gamma^{-1}$.
This is essentially a length contraction argument except we are contracting the “length” of the $\vec{p}$-vector instead of the $\vec{x}$-vector.
Question 14. We know that Λp = ΛL(p)k = L(Λp)k. So, is it true that L(Λp) = ΛL(p)?
Answer 14. Nope. We have to choose a standard Lorentz transformation convention, which is
the same for both L(p) and L(Λp). This is because the matrix problem
Λk = p
does not return a unique solution for Λ. Examples of standard Lorentz transformations are given
by eqs. (2.5.24) and (2.5.44) for massive and massless particles, respectively.
Question 15. The little groups of massive and massless particles are SO(3) and ISO(2), respectively. How can we think about them?
Answer 15. In fact, SO(3) and ISO(2) are nearly the same group. They both have three generators. As in Wigner’s original paper, a transform to lightcone coordinates shows that ISO(2)
leaves 3 degrees of freedom empty, even though the reference momentum (k, 0, 0, k) has only two
nonzero entries.
A nice way to think about them is to analogize from the regular rotation group. If we start with SO(3) and break the symmetry along $\hat{z} = (0, 0, 1)$, we are left with two degrees of freedom. If we start with SO(3) and break the symmetry along $(1/\sqrt{2}, 1/\sqrt{2}, 0)$, we are still left with two degrees of freedom, even though $(1/\sqrt{2}, 1/\sqrt{2}, 0)$ has two nonzero entries rather than one.
Question 16. How is the little group of a massless particle related to gauge invariance?
Answer 16. Weinberg (p. 71) shows that the unitary which encodes the little group transforms on wavefunctions $\Psi$ corresponding to massless particles is
$$U(W(\theta, \alpha, \beta)) = 1 + i\alpha A + i\beta B + i\theta J_3.$$
Here, $A = -J^{13} + J^{10}$, $B = -J^{23} + J^{20}$, and $J_3 = J^{12}$. Because $[A, B] = 0$, the operators $A$ and $B$ can be diagonalized simultaneously, and the wavefunction can be labeled by their eigenvalues $a$, $b$. This turns out to give the U(1) gauge symmetry of QED. “If we find one set of nonzero eigenvalues of A, B, then we find a whole continuum.” Because $[J_3, A] = iB$ and $[J_3, B] = -iA$, the operator $J_3$ rotates the wavefunction in the AB plane. We find that
$$A\Psi^\theta_{kab} = (a\cos\theta - b\sin\theta)\Psi^\theta_{kab}, \quad B\Psi^\theta_{kab} = (a\sin\theta + b\cos\theta)\Psi^\theta_{kab}, \quad \text{where } \Psi^\theta_{kab} = U^{-1}(R(\theta))\Psi_{kab}.$$
There is no such continuum of eigenstates observed in nature. So, we require the eigenvalues to be $a = b = 0$. The state is merely labeled by its $J_3$ eigenvalue, which we call $\sigma$, or the helicity.
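The ISO(2) commutators quoted above can be checked directly in the 4-vector representation. The sketch below assumes the signature $(-,+,+,+)$ with index order $(t, x, y, z)$ and takes the generator matrices from $\Lambda = 1 + \frac{i}{2}\omega_{\rho\sigma}J^{\rho\sigma}$, which gives $(J^{\rho\sigma})^\mu{}_\nu = -i(\eta^{\mu\rho}\delta^\sigma_\nu - \eta^{\mu\sigma}\delta^\rho_\nu)$. These conventions and the helper names are assumptions of the example; with a different signature or sign convention some signs below would change.

```python
# Assumed conventions: signature (-,+,+,+), index order (t, x, y, z).
ETA_DIAG = [-1, 1, 1, 1]

def J(rho, sigma):
    """Vector-representation generator (J^{rho sigma})^mu_nu as a 4x4 matrix."""
    gen = [[0j] * 4 for _ in range(4)]
    gen[rho][sigma] += -1j * ETA_DIAG[rho]    # -i eta^{mu rho} delta^sigma_nu
    gen[sigma][rho] += 1j * ETA_DIAG[sigma]   # +i eta^{mu sigma} delta^rho_nu
    return gen

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(4)] for i in range(4)]

def scale(c, a):
    return [[c * a[i][j] for j in range(4)] for i in range(4)]

def comm(a, b):
    """Commutator [a, b]."""
    return sub(matmul(a, b), matmul(b, a))

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))

def apply(a, v):
    return [sum(a[i][j] * v[j] for j in range(4)) for i in range(4)]

ZERO = [[0j] * 4 for _ in range(4)]

A = sub(J(1, 0), J(1, 3))    # A = -J^{13} + J^{10}
B = sub(J(2, 0), J(2, 3))    # B = -J^{23} + J^{20}
J3 = J(1, 2)                 # J_3 = J^{12}

k = [1.0, 0.0, 0.0, 1.0]     # massless reference momentum, moving along z

print(close(comm(A, B), ZERO))              # [A, B] = 0
print(close(comm(J3, A), scale(1j, B)))     # [J3, A] = iB
print(close(comm(J3, B), scale(-1j, A)))    # [J3, B] = -iA
print(apply(A, k), apply(B, k), apply(J3, k))  # all three annihilate k
```

The last line confirms that $A$, $B$, and $J_3$ all leave the reference momentum fixed, which is what makes them generators of the little group.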
Question 17. How does the massless particle little group show that the helicity of the photon
must be σ = ±1?
Answer 17. Main point: since ISO(2) is a non-compact Lie group, the only finite dimensional
unitary representation of ISO(2) is one-dimensional. So each photon can exist only in one helicity
eigenstate. Two photons of σ = 1 and σ = −1 can actually be thought of as two particles of
“different species,” which are related only by P symmetry. (However, massive
particles of given helicity, such as a massive Weyl spinor, cannot be regarded as particles of
“different species” because a boost past the massive spinor changes its helicity.)
This derivation is based on https://www.physicsforums.com/threads/reduced-density-matrices-and-l
315387/#post-2223048.
This is mostly mathematical, but hopefully we can also extract some physical intuition. Let’s first study why the Pauli-Lubanski pseudovector is relevant for labeling the irreps of the Lorentz group. The punchline is that the Pauli-Lubanski pseudovector arises from the stabilizer, or Wigner little group.
Let the Lorentz transform on 4-vectors (such as momenta) be $q'^\mu = \Lambda^\mu{}_\nu q^\nu$, where $\Lambda^\mu{}_\nu = \delta^\mu{}_\nu + \omega^\mu{}_\nu$. The Wigner little group is built from transforms such that
$$\omega^\mu{}_\nu q^\nu = 0 \implies \omega_{\mu\nu} = \epsilon_{\mu\nu\pi\eta}\, q^\pi n^\eta,$$
where $n^\eta$ is an arbitrary 4-vector. A typical element of the little group for momentum $q_\mu$ can therefore be written as
$$U_q(n, a) = \exp\left(i a_\mu P^\mu - \frac{i}{2}\epsilon^{\mu\nu\pi\eta}\, q_\pi n_\eta J_{\mu\nu}\right).$$
This motivates the introduction of the Pauli-Lubanski pseudovector, defined as
$$W^\mu = \frac{1}{2}\epsilon^{\mu\nu\pi\eta} J_{\nu\pi} P_\eta.$$
It satisfies the properties
$$0 = W_\mu P^\mu, \quad 0 = [W_\mu, P_\nu], \quad [M^{\mu\nu}, W^\eta] = i\eta^{\mu\eta} W^\nu - i\eta^{\nu\eta} W^\mu, \quad [W^\mu, W^\nu] = i\epsilon^{\mu\nu\eta\pi} W_\eta P_\pi.$$
(Here $M^{\mu\nu}$ denotes the Lorentz generators, written $J^{\mu\nu}$ elsewhere in these notes.) Thus, elements of the little group for momentum $q^\mu$ can be written in terms of the Pauli-Lubanski pseudovector as
$$U_q(n, a) = e^{i a_\mu q^\mu}\, e^{i n_\mu W^\mu}.$$
This follows because (1) $P^\mu$ acts like $q^\mu$ on eigenstates $|q\rangle$ and (2) $W^\mu$ and $P^\nu$ commute, so the Baker-Campbell-Hausdorff formula implies we can split the exponential. This suggests that the inhomogeneous Lorentz algebra can be characterized by these two parts of the unitary:
• The first (inhomogeneous) part is the Casimir $P^2$. This describes the $e^{i a_\mu q^\mu}$ part of the unitary. It is not really part of the homogeneous Lorentz group, because the homogeneous Lorentz group does not include translations.
• The second (homogeneous) part is the Casimir $W^2$. This describes the $e^{i n_\mu W^\mu}$ part of the unitary. In fact, the relation $[W^\mu, W^\nu] = i\epsilon^{\mu\nu\eta\pi} W_\eta P_\pi = i\epsilon^{\mu\nu\eta\pi} W_\eta q_\pi$ for fixed momentum $q^\mu$ implies that the components of the Pauli-Lubanski pseudovector form a Lie algebra when restricted to fixed momentum. This will turn out to be the Lie algebra of the Wigner little group.
Because $e^{i n_\mu W^\mu}$ simply rotates between different states with momentum $q^\mu$, the spin properties of the particles are determined by the Pauli-Lubanski pseudovector. Because we do not see particles of infinite spin, we choose on physical grounds to study only the finite-dimensional representations of the Wigner little group. Because the rest of the inhomogeneous Lorentz group is labeled by $P^2$, we will study the cases $P^2 = m^2 > 0$ and also $P^2 = 0$. You could raise the question of why we can just choose to study the special momentum $q_1 = (m, \vec{0})$ rather than $q_2 = (\gamma m, \vec{p})$. The answer is that the representation of the Lorentz group is labeled by the Casimirs $P^2$ and $W^2$. Because $q_1$ and $q_2$ have the same value of $P^2$, they give the same representation of the Lorentz group. So, we may as well go with the simplest choice. Another way to see this is to note that substituting
$$W^\mu = \frac{1}{2}\epsilon^{\mu\nu\pi\eta} J_{\nu\pi} P_\eta \quad \text{into} \quad [W^\mu, W^\nu] = i\epsilon^{\mu\nu\eta\pi} W_\eta P_\pi$$
gives a factor of $P^2$ on both sides of the Lie bracket equation.
Now we proceed as before.
• Massive little group: Choose $q^\mu = (m, \vec{0})$. We find that
$$W_0 = 0, \quad W_i = mJ_i \implies [W^i, W^j] = i\epsilon^{ijk0} W_k q_0 = im\epsilon^{ijk} W_k.$$
This is the familiar SU(2) or SO(3) Lie algebra. It is compact, so there are nontrivial finite-dimensional representations.
• Massless little group: Choose $q^\mu = (E, 0, 0, E)$. The Pauli-Lubanski pseudovector turns out to be
$$W^\mu = (J_{12},\ J_{32} + J_{02},\ J_{31} + J_{01},\ J_{12}) = (J_{12}, E_1, E_2, J_{12}).$$
Therefore, the Casimir is given by
$$W^\mu W_\mu = E_1^2 + E_2^2.$$
So mathematically, $P^2 = 0$ does not imply $W^2 = 0$. However, we will still want $W^2 = 0$ for physical reasons: if $W^2 > 0$, then the representation of ISO(2) is infinite-dimensional, and the particles are interpreted to have infinite spin. (Why is the representation infinite-dimensional for $W^2 = E_1^2 + E_2^2 > 0$? Note that we can choose $E_1$ and $E_2$ from a circle of radius $\sqrt{W^2}$. Therefore, there is a continuum of eigenstates and the irrep is infinite-dimensional. The irreps for SO(3) are finite-dimensional because they are labeled only by $J^2$ and not $J_1^2 + J_2^2$ or something.) Therefore, in the massless case we have
$$U_q(a, n) = e^{i a_\mu q^\mu}\, e^{i\theta J_{12}},$$
and for particles with finite spin, ISO(2) is purposely “broken by hand” to U(1). If we label the eigenvalue
$$J_{12}|q, h\rangle = h|q, h\rangle,$$
we find that $h = 0, \pm\tfrac{1}{2}, \pm 1, \cdots$ by requiring invariance under $\theta \to \theta + 4\pi$ (a rotation by $2\pi$ is allowed to flip the sign of the state). $h$ is called the helicity quantum number.
In fact, the relation
$$W^\mu = hP^\mu$$
is true in all reference frames, even though we derived it for a special choice of momentum. This is because both sides are (pseudo)vectors which transform the same way under the Lorentz group.
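The massive case in the list above can be checked in the smallest nontrivial representation. The sketch below takes spin 1/2, where $J_i = \sigma_i/2$ with $\sigma_i$ the Pauli matrices, sets $W_i = mJ_i$ as in the rest frame, and verifies $[W_1, W_2] = imW_3$ together with $\vec{W}^2 = m^2 j(j+1)$. The choice of spin 1/2, the value of `M`, and the helper names are illustrative assumptions.

```python
M = 2.0  # mass (arbitrary illustrative value)

# Pauli matrices; the spin-1/2 angular momentum is J_i = sigma_i / 2.
SIGMA = {
    1: [[0, 1], [1, 0]],
    2: [[0, -1j], [1j, 0]],
    3: [[1, 0], [0, -1]],
}

def W(i):
    """Rest-frame Pauli-Lubanski component W_i = m J_i for spin 1/2."""
    return [[M * 0.5 * entry for entry in row] for row in SIGMA[i]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def comm(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

# Left- and right-hand sides of [W_1, W_2] = i m W_3.
lhs = comm(W(1), W(2))
rhs = [[1j * M * entry for entry in row] for row in W(3)]

# Sum of squares W_1^2 + W_2^2 + W_3^2, which should be m^2 j(j+1) times the identity.
squares = [matmul(W(i), W(i)) for i in (1, 2, 3)]
casimir = [[sum(sq[i][j] for sq in squares) for j in range(2)] for i in range(2)]
expected = M ** 2 * 0.5 * 1.5  # m^2 j (j + 1) with j = 1/2

print(lhs, rhs)
print(casimir, expected)
```

The value of the spatial sum of squares is what labels the spin. Its relation to the invariant $W^\mu W_\mu$ involves a sign that depends on the metric signature.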
Conclusion: it is perhaps dangerous to think of the photon as having “only two helicity eigenstates,” because the helicity eigenstate $\sigma = 1$ is from a completely different irrep than $\sigma = -1$. A better way to say it is: the physical requirement that particles have finite spin leads us to label massless particles by the helicity quantum numbers $h = 0, \pm\tfrac{1}{2}, \pm 1, \cdots$. Because the helicities $h = \pm\sigma$ are related by parity P, we have to group them together when doing physics. They are not, however, considered to be “different states of the same particle,” but rather “different particles which are intimately related.”
2.4 Discrete symmetries
Question 18. Describe the four parts of the Lorentz group.
Answer 18. The proper orthochronous Lorentz group is that subgroup which has $\det\Lambda = 1$ and $\Lambda^0{}_0 \geq 1$. (The choices are $\det\Lambda = \pm 1$ and $(\Lambda^0{}_0)^2 = 1 + \Lambda^0{}_i \Lambda^0{}_i$.) We denote the parity and time reversal operators on points in spacetime by $\mathcal{P}$, $\mathcal{T}$, respectively. We denote the induced parity and time reversal operators on wavefunctions by P, T, respectively. In other words,
$$\mathrm{P} = U(\mathcal{P}, 0), \quad \mathrm{T} = U(\mathcal{T}, 0).$$
The other three parts of the Lorentz group are obtained by acting on the proper orthochronous Lorentz group with $\mathcal{P}$, $\mathcal{T}$, $\mathcal{P}\mathcal{T}$.
Question 19. What does it mean, mathematically, for a wavefunction to be invariant under parity or time symmetries?
Answer 19. To a good approximation (except for parity violation and time-reversal violations, which occur only at high energy scale), the regular composition rules for Poincaré transforms are also satisfied for these discrete symmetry transforms:
$$\mathrm{P}\,U(\Lambda, a)\,\mathrm{P}^{-1} = U(\mathcal{P}\Lambda\mathcal{P}^{-1}, \mathcal{P}a), \quad \mathrm{T}\,U(\Lambda, a)\,\mathrm{T}^{-1} = U(\mathcal{T}\Lambda\mathcal{T}^{-1}, \mathcal{T}a).$$
Just as before, we can find the transformation rules for $J^{\mu\nu}$, $P^\mu$ under the parity and time reversal symmetries by taking $\Lambda$, $a$ to be infinitesimal. We decide that P must be unitary and T must be antiunitary by observing how the energy operator $H := P^0$ transforms,
$$\mathrm{P}\,iH\,\mathrm{P}^{-1} = iH, \quad \mathrm{T}\,iH\,\mathrm{T}^{-1} = -iH,$$
and demanding there be no states of negative energy.
However, energy is preserved under both P and T,
$$\mathrm{P}H\mathrm{P}^{-1} = H, \quad \mathrm{T}H\mathrm{T}^{-1} = H,$$
and the distinction is just the unitary-antiunitary difference.
Question 20. What do P and T do to massive particle wavefunctions Ψpσ ?
Answer 20. There is a nice way to do this, which uses the fact that 0 is its own negative. Let
k be the reference momentum, $\vec{P}\Psi_{k\sigma} = 0$. 0 is invariant under P and P doesn't change angular
momentum, so
$$P\Psi_{k\sigma} = \eta\Psi_{k\sigma},$$
where η is a σ-independent phase. (You can show this, for example, by hopping up and down σ
with the $J_\pm$ operators, or by using the same phase argument as in the proof of Wigner's unitary-antiunitary theorem, since we know P is unitary.) Because $\mathcal{P}L(p)\mathcal{P}^{-1} = L(\mathcal{P}p)$, we find
$$P\Psi_{p\sigma} = P\,N(p)\,U(L(p))\Psi_{k\sigma} = \eta\Psi_{\mathcal{P}p,\sigma}.$$
This is modified for T, which flips angular momentum. In this case, T is antiunitary and the phase
is no longer σ-independent. The result follows from the surprising fact that $\mathcal{T}L(p)\mathcal{T}^{-1} = L(\mathcal{P}p)$
and is
$$T\Psi_{p\sigma} = \zeta(-1)^{j-\sigma}\Psi_{\mathcal{P}p,-\sigma}.$$
Question 21. How to understand $\mathcal{P}L(\vec\beta)\mathcal{P}^{-1} = L(\mathcal{P}\vec\beta)$ and $\mathcal{T}L(\vec\beta)\mathcal{T}^{-1} = L(\mathcal{P}\vec\beta)$?
Answer 21. I introduced the vector $\vec\beta$ so it's easier to think about the Lorentz boost along β̂.
As shown in a previous question, energy H is preserved under both parity and time. Rather than
multiplying these relations out, you can think of them as follows. Introduce a four-vector $(E, \vec q)$
on which the Lorentz transforms will act. Note that
$$L(\mathcal{P}\vec\beta)(E, \vec q) = L(-\vec\beta)(E, \vec q).$$
We have
$$\mathcal{P}L(\vec\beta)\mathcal{P}^{-1}(E, \vec q) = \mathcal{P}L(\vec\beta)(E, -\vec q).$$
This means that $\mathcal{P}$ will reverse the sign of every unpaired vector in the result $L(\vec\beta)(E, -\vec q)$. That
will merely give $L(-\vec\beta)(E, \vec q)$. Similarly,
$$\mathcal{T}L(\vec\beta)\mathcal{T}^{-1}(E, \vec q) = \mathcal{T}L(\vec\beta)(E, -\vec q).$$
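Both matrix identities can be checked numerically. The following is a minimal sketch, not part of Weinberg's argument: the helper `boost`, the time-first ordering (E, qx, qy, qz), and the sample velocity are illustrative assumptions.

```python
import numpy as np

def boost(beta_vec):
    """4x4 Lorentz boost L(beta) acting on column vectors (E, qx, qy, qz), with c = 1."""
    beta_vec = np.asarray(beta_vec, dtype=float)
    b2 = beta_vec @ beta_vec
    gamma = 1.0 / np.sqrt(1.0 - b2)
    L = np.eye(4)
    L[0, 0] = gamma
    L[0, 1:] = gamma * beta_vec
    L[1:, 0] = gamma * beta_vec
    if b2 > 0:
        L[1:, 1:] += (gamma - 1.0) * np.outer(beta_vec, beta_vec) / b2
    return L

P = np.diag([1.0, -1.0, -1.0, -1.0])   # parity: flips the spatial components
T = np.diag([-1.0, 1.0, 1.0, 1.0])     # time reversal: flips the time component

beta = np.array([0.3, -0.2, 0.5])      # arbitrary sample velocity (assumption)
L_plus, L_minus = boost(beta), boost(-beta)

parity_ok = np.allclose(P @ L_plus @ np.linalg.inv(P), L_minus)
time_ok = np.allclose(T @ L_plus @ np.linalg.inv(T), L_minus)
print(parity_ok, time_ok)
```

Both conjugations flip only the mixed time-space entries of the boost matrix, which is the same as sending $\vec\beta \to -\vec\beta$.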
2.5 Problems
Question 22. Suppose that observer O sees a W boson (spin-1 and m > 0) with momentum $\vec p$
in the ŷ-direction and spin z-component σ. A second observer O′ moves relative to the first with
velocity $\vec v$ in the ẑ-direction. How does O′ describe the W state?
Answer 22. This is just like in classical mechanics, where a special relativity problem asks you
to find what person A reads on person B's clock, or how long person B's hot dog is. It is a lot
harder now, though, because there is not just one state: the answer is a superposition of different states.
We will use
$$U(\Lambda)\Psi_{p\sigma} = \sqrt{\frac{(\Lambda p)^0}{p^0}} \sum_{\sigma'} D^{(j)}_{\sigma'\sigma}(W(\Lambda, p))\,\Psi_{\Lambda p,\sigma'}$$
and also the standard boost L(p) given in (2.5.24), along with $W(\Lambda, p) = L^{-1}(\Lambda p)\Lambda L(p)$. The
explicit form of the boost along ẑ gives (time component last)
$$p^\mu = \left(0, |\vec p|, 0, \sqrt{\vec p^{\,2} + m^2}\right), \quad (\Lambda p)^\mu = \left(0, |\vec p|, -\gamma\beta\sqrt{\vec p^{\,2} + m^2}, \gamma\sqrt{\vec p^{\,2} + m^2}\right).$$
Above, $\beta := |\vec v|/c$ and $\gamma := (1 - \beta^2)^{-1/2}$. The rest of the solution is very long and probably
not worth it. Here is an outline: substitute $p^\mu$ and $(\Lambda p)^\mu$ into (2.5.24) to find L(p) and L(Λp).
Λ, the Lorentz boost, is already known. Then use Mathematica or something to find the inverse
$L(\Lambda p)^{-1}$, and multiply to find W(Λ, p). Finally, substitute this into the explicit form (2.5.20-22) of
$D^{(j)}_{\sigma'\sigma}(W(\Lambda, p))$, which gives the coefficients of the superposition. The spin-projection is still considered
to be along the ẑ-axis, but now there are multiple spin-projections that could be observed.
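The matrix part of this outline is easy to do numerically instead of in Mathematica. The sketch below is only an illustration under stated assumptions: it uses time-first ordering (E, px, py, pz) rather than Weinberg's time-last convention, the standard boost is written in the usual form L = 1 + (γ − 1) p̂p̂ on the spatial block, and the mass, momentum, and velocity values are arbitrary. It stops at W(Λ, p) and does not evaluate the spin-1 rotation matrix.

```python
import numpy as np

def standard_boost(p, m):
    """Standard boost L(p) taking k = (m, 0, 0, 0) to (E, p); time component first."""
    p = np.asarray(p, dtype=float)
    E = np.sqrt(p @ p + m**2)
    gamma = E / m
    L = np.eye(4)
    L[0, 0] = gamma
    L[0, 1:] = p / m
    L[1:, 0] = p / m
    if p @ p > 0:
        L[1:, 1:] += (gamma - 1.0) * np.outer(p, p) / (p @ p)
    return L

def wigner_rotation(p, m, beta):
    """W = L^{-1}(Lambda p) Lambda L(p) for an observer moving with speed beta along z."""
    g = 1.0 / np.sqrt(1.0 - beta**2)
    Lam = np.eye(4)
    Lam[0, 0] = Lam[3, 3] = g
    Lam[0, 3] = Lam[3, 0] = -g * beta
    E = np.sqrt(np.dot(p, p) + m**2)
    Lp = Lam @ np.concatenate(([E], p))          # boosted four-momentum
    return np.linalg.inv(standard_boost(Lp[1:], m)) @ Lam @ standard_boost(p, m)

m, p, beta = 1.0, np.array([0.0, 2.0, 0.0]), 0.6  # sample values (assumptions)
W = wigner_rotation(p, m, beta)
R = W[1:, 1:]                                     # spatial block: a rotation about x
angle = np.arctan2(R[2, 1], R[1, 1])
print(np.round(W, 6))
print("Wigner angle about x (radians):", angle)
```

Because the momentum lies along ŷ and the boost along ẑ, the resulting W is a pure rotation about x̂ by a nonzero angle, which is why the state seen by O′ mixes different σ.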
Question 23. Suppose that observer O sees a photon with momentum $\vec p$ in the ŷ-direction and
polarization vector in the ẑ-direction. A second observer O′ moves relative to the first with velocity
$\vec v$ in the ẑ-direction. How does O′ describe the same photon?
Answer 23. This is like the previous problem but easier, because we know the helicity is conserved.
So, O′ sees the photon having spin-1 in the ẑ-direction.
Since we chose the helicity in the ẑ-direction, we must choose the reference momentum k to be in
the ẑ-direction,
$$k = (E, 0, 0, E).$$
This is because of how helicity is defined. To find the Wigner rotation, use the explicit form of Λ
to compute that
$$p = (|\vec p|, 0, |\vec p|, 0), \quad \Lambda p = (\gamma|\vec p|, 0, |\vec p|, -\gamma\beta|\vec p|).$$
Note that generally, $E \neq |\vec p|$. For example, photons can be red-shifted.
Now, use eq. (2.5.44) in Weinberg.
• For p, we have
$$u = \frac{|\vec p|}{E}, \quad \hat p = (0, 1, 0).$$
Inserting these into (2.5.45) and using the canonical 2D rotation between ŷ and ẑ, gives (in
Weinberg notation, where time is the last entry in a 4-vector):
$$L(p) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & \frac{u^2+1}{2u} & \frac{u^2-1}{2u} \\ 0 & -1 & 0 & 0 \\ 0 & 0 & \frac{u^2-1}{2u} & \frac{u^2+1}{2u} \end{pmatrix}.$$
• For Λp, we have
$$u = \frac{\gamma|\vec p|}{E}, \quad \hat p = (0, \gamma^{-1}, -\beta).$$
The corresponding L(Λp) turns out to be a generalization of the above,
$$L(\Lambda p) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -\beta & \frac{u^2+1}{2u}\gamma^{-1} & \frac{u^2-1}{2u}\gamma^{-1} \\ 0 & -\gamma^{-1} & -\frac{u^2+1}{2u}\beta & -\frac{u^2-1}{2u}\beta \\ 0 & 0 & \frac{u^2-1}{2u} & \frac{u^2+1}{2u} \end{pmatrix}.$$
You can use Mathematica to compute the inverse, $L^{-1}(\Lambda p)$. The amazing (disappointing?) result
is that
$$W(\Lambda, p) = L^{-1}(\Lambda p)\Lambda L(p) = \begin{pmatrix} 1 & A \\ A^T & B \end{pmatrix}.$$
I used block notation. According to (2.5.32), this means that θ = 0, even though α, β ≠ 0.
So, there is no phase picked up, and
$$U(\Lambda)\Psi_{p\sigma} = \sqrt{\gamma}\,\Psi_{\Lambda p,\sigma}.$$
Why is there no relative phase θ? I guess it is because θ is the angle connected to rotations about
the ẑ-axis. However, in constructing the states $\Psi_{p\sigma}$ and $\Psi_{\Lambda p,\sigma}$, we never had to rotate about the
ẑ-axis, only around the x̂-axis. So there is no relative phase picked up, although there may be a
relative phase if we chose a different momentum.
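As a cross-check on the matrices in this answer, here is a small numerical sketch. It follows the same construction (a boost along ẑ followed by a rotation about x̂, time component last), but the helper names and the sample values of E, |p| and β are assumptions made for illustration.

```python
import numpy as np

ETA = np.diag([1.0, 1.0, 1.0, -1.0])   # metric in (x, y, z, t) ordering, time last

def boost_z(u):
    """B(u): boost along z taking k = (0, 0, E, E) to (0, 0, uE, uE)."""
    B = np.eye(4)
    B[2, 2] = B[3, 3] = (u**2 + 1) / (2 * u)
    B[2, 3] = B[3, 2] = (u**2 - 1) / (2 * u)
    return B

def rot_x_taking_z_to(n):
    """Rotation about x that takes z-hat to the unit vector n = (0, ny, nz)."""
    R = np.eye(4)
    R[1, 1], R[1, 2] = n[2], n[1]
    R[2, 1], R[2, 2] = -n[1], n[2]
    return R

E_ref, pmag, beta = 1.0, 2.5, 0.6                     # sample values (assumptions)
gamma = 1.0 / np.sqrt(1.0 - beta**2)
k = np.array([0.0, 0.0, E_ref, E_ref])
p = np.array([0.0, pmag, 0.0, pmag])

Lam = np.eye(4)                                       # boost to the frame of O'
Lam[2, 2] = Lam[3, 3] = gamma
Lam[2, 3] = Lam[3, 2] = -gamma * beta
Lp = Lam @ p

L_p = rot_x_taking_z_to([0.0, 1.0, 0.0]) @ boost_z(pmag / E_ref)
L_Lp = rot_x_taking_z_to([0.0, 1.0 / gamma, -beta]) @ boost_z(gamma * pmag / E_ref)
W = np.linalg.inv(L_Lp) @ Lam @ L_p

print(np.allclose(L_p @ k, p), np.allclose(L_Lp @ k, Lp))   # standard boosts hit their targets
print(np.allclose(W @ k, k))                                # W lies in the little group of k
print(np.round(W, 6))
```

In the parametrization W = S(α, β)R(θ), the upper-left 2×2 block of W is the rotation by θ, so seeing the identity there is the numerical statement that θ = 0 for this choice of momentum and boost.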
3 Scattering Theory
3.1 S-matrix
Question 24. Mathematically, what is the difference between an “in” and “out” state?
Answer 24. The sign of ±iε in the denominator. The “in” state is called $\Psi^+$ and the “out” state
is called $\Psi^-$. More on this later.
Question 25. What are some properties of “in” and “out” states?
Answer 25. “In” and “out” states are taken to be non-interacting, in the sense that the wavefunction is factorizable into the product of wavefunctions describing the individual particles. The
corresponding Lorentz transformation law is
$$U(\Lambda, a)\Psi_\alpha = e^{-ia_\mu\left((\Lambda p_1)^\mu + (\Lambda p_2)^\mu + \cdots\right)} \sqrt{\frac{(\Lambda p_1)^0(\Lambda p_2)^0\cdots}{p_1^0 p_2^0\cdots}} \sum_{\sigma_1'\sigma_2'\cdots} D^{(j_1)}_{\sigma_1'\sigma_1}(W(\Lambda, p_1))\,D^{(j_2)}_{\sigma_2'\sigma_2}(W(\Lambda, p_2))\cdots\Psi_{\alpha'},$$
where $\alpha = \{p_1\sigma_1; p_2\sigma_2; \cdots\}$ and $\alpha' = \{\Lambda p_1\sigma_1'; \Lambda p_2\sigma_2'; \cdots\}$.
Some more properties:
1. “In” and “out” states are Heisenberg-picture states. They describe the whole history of a
system of particles, and hence do not evolve in time by themselves.
2. “In” and “out” states $\Psi^\pm_\alpha$ could be called a “smooth adiabatic evolution” of the non-interacting states $\Phi_\alpha$, through the unitary $\Omega(\tau) = e^{i\int d\tau H} e^{-i\int d\tau H_0}$:
$$\Psi^\pm_\alpha = \Omega(\mp\infty)\Phi_\alpha.$$
3. For a given set of particles and momenta, the “in” and “out” states are the only possible
states. This is because there are only two ways to shift the contour, ±iε. More physically,
we need |τ | → ∞, and there are two possibilities: τ → −∞ and τ → ∞. Strictly speaking
this is a little misleading, because in a scattering process the “in” and “out” states are not
in the same state α, but the point remains.
Question 26. What is the formal definition of “in” and “out” state?
Answer 26. Let H = H0 + V, and let these have eigenfunctions
$$H_0\Phi_\alpha = E_\alpha\Phi_\alpha, \quad H\Psi_\alpha = E_\alpha\Psi_\alpha.$$
Let g(α) ∈ ℂ be a smooth “envelope” function that tells us which states are in our state vector.
Define the states
$$\Psi^\pm_g(t) = \int d\alpha\, e^{-iE_\alpha t} g(\alpha)\Psi^\pm_\alpha, \quad \Phi_g(t) = \int d\alpha\, e^{-iE_\alpha t} g(\alpha)\Phi_\alpha.$$
If
$$\Psi^\pm_g(t) \to \Phi_g(t) \quad\text{for } t \to \mp\infty$$
for every choice of g(α), then we say $\Psi^\pm_\alpha$ is an “in” or “out” state. (The upper sign is the “in” state, which matches the free state in the far past, consistent with $\Psi^\pm_\alpha = \Omega(\mp\infty)\Phi_\alpha$.) The introduction of the arbitrary
envelope g(α) is basically to make sure the phases and magnitudes of $\Psi^\pm_\alpha$ and $\Phi_\alpha$ match up, and
match up between different α's as well. (For example, we can always multiply an eigenvector by
an arbitrary complex number, and it is still a good eigenvector. This construction removes that
freedom.)
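The eigenvalue equation
$$(H_0 - E_\alpha)\Psi^\pm_\alpha = -V\Psi^\pm_\alpha$$
is not exactly invertible unless we perturb the contour. We can also add a solution of the homogeneous equation, $(H_0 - E_\alpha)\Phi_\alpha = 0$. The result is
$$\Psi^\pm_\alpha = \Phi_\alpha + (E_\alpha - H_0 \pm i\epsilon)^{-1} V\Psi^\pm_\alpha,$$
which gives the Lippmann-Schwinger equation
$$\Psi^\pm_\alpha = \Phi_\alpha + \int d\beta\, \frac{T^\pm_{\beta\alpha}\Phi_\beta}{E_\alpha - E_\beta \pm i\epsilon}, \quad\text{where } T^\pm_{\beta\alpha} = \langle\Phi_\beta, V\Psi^\pm_\alpha\rangle.$$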
Question 27. What is the S-matrix, and what is the S-operator? How to interpret them?
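Answer 27. The S-matrix is the overlap between the “in” state $\Psi^+_\alpha$ and the “out” state $\Psi^-_\beta$,
$$S_{\beta\alpha} = \langle\Psi^-_\beta|\Psi^+_\alpha\rangle.$$
The rate of reaction is proportional to $|S_{\beta\alpha} - \delta(\alpha - \beta)|^2$, since $S_{\beta\alpha} = \delta(\alpha - \beta)$ in the noninteracting
case. $S_{\beta\alpha}$ is a unitary matrix, $\int d\beta\, S_{\gamma\beta}S^*_{\alpha\beta} = \delta(\gamma - \alpha)$.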
The S-operator is constructed to give the same matrix elements for non-interacting states as the
S-matrix does for interacting states. It is S in the expression
$$S_{\beta\alpha} := \langle\Phi_\beta|S\Phi_\alpha\rangle.$$
Substituting $\Psi^\pm_\alpha = \Omega(\mp\infty)\Phi_\alpha$ gives
$$S = \Omega(\infty)^\dagger\,\Omega(-\infty) = U(\infty, -\infty).$$
It may seem funny that the definition $S_{\beta\alpha} = \langle\Psi^-_\beta|\Psi^+_\alpha\rangle$ of the S-matrix has no time-evolution inside.
This makes sense if we think about it in the Heisenberg picture, where the wavefunctions are constant
in time. In a sense, the time-evolution is already inside $\Psi^+_\alpha$, and we just want to see which
“out” state it will end up in. Another way to think about it is that the effect of the potential,
V, is already encoded in $\Psi^+_\alpha$ and $\Psi^-_\beta$. Therefore, the S-matrix already accounts for the effect of
the scattering potential on incoming particles. However, $\Phi_\alpha$ and $\Phi_\beta$ do not feel the scattering
potential, which is why we have to insert the S-operator into the matrix element by hand.
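Question 28. Prove the following important decomposition of the S-matrix and interpret it physically,
$$S_{\beta\alpha} = \delta(\beta - \alpha) - 2\pi i\,\delta(E_\alpha - E_\beta)\,T^+_{\beta\alpha}, \quad\text{where } T^+_{\beta\alpha} = \langle\Phi_\beta|V\Psi^+_\alpha\rangle.$$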
Answer 28. Before the math, let’s think about it:
• The scattering part of $S_{\beta\alpha}$ is ∝ $\delta(E_\alpha - E_\beta)$. This means that energy is conserved.
• The reason $S_{\beta\alpha}$ could possibly be different from δ(β − α) is that we are comparing apples to
oranges: the “in” state $\Psi^+_\alpha$ and the “out” state $\Psi^-_\beta$ are in different bases of the same Hilbert
space. In order to compute the S-matrix element, we must switch to a common basis. In
the calculation below, I will express both of these in the $\Phi_\alpha$ basis.
• Because $T_{\beta\alpha} \propto V$, $S_{\beta\alpha} = \delta(\beta - \alpha)$ in the case that V = 0.
• How to interpret the scattering term? Weinberg shows (pg 115) that, as t → ∞,
$$\Psi^+_\alpha(t) \to e^{-iE_\alpha t}\left(\Phi_\alpha - 2\pi i\int d\beta\,\delta(E_\alpha - E_\beta)\,T^+_{\beta\alpha}\Phi_\beta\right).$$
So, it's fair to say that $T^+_{\beta\alpha}$ encodes the part of the “in” α wavefunction that doesn't end
up in the “out” α wavefunction after time-evolution. We know this because the “out”
wavefunction $\Psi^-_\beta$ exactly matches up with the non-interacting wavefunction $\Phi_\beta$ at t → ∞.
In some sense,
$$\Psi^+_\alpha = \Phi_\alpha + \int d\gamma\,\frac{T^+_{\gamma\alpha}\Phi_\gamma}{E_\alpha - E_\gamma + i\epsilon}$$
already has the structure of the S-matrix above. Even though every state in this decomposition is at the same energy (and hence evolves at the same rate), $\Psi^+_\alpha$ simply contains different
states than $\Psi^-_\alpha$ does. Ultimately, we can trace this back to the fact that Ω(∞) ≠ Ω(−∞).
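I will give a different proof from Weinberg. Use
$$\Psi^+_\alpha = \Phi_\alpha + \int d\gamma\,\frac{T^+_{\gamma\alpha}\Phi_\gamma}{E_\alpha - E_\gamma + i\epsilon}, \quad \Psi^-_\beta = \Phi_\beta + \int d\eta\,\frac{T^-_{\eta\beta}\Phi_\eta}{E_\beta - E_\eta - i\epsilon}.$$
Just smash them together and use the property that $\langle\Phi_\beta, \Phi_\gamma\rangle = \delta_{\beta\gamma}$, and that
$(E \pm i\epsilon)^{-1} = \mathrm{P}\frac{1}{E} \mp i\pi\delta(E)$, where P here denotes the principal value. The result is
$$\langle\Psi^-_\beta|\Psi^+_\alpha\rangle = \delta_{\alpha\beta} - i\pi\delta(E_\alpha - E_\beta)\,T^+_{\beta\alpha} + i\pi\delta(E_\alpha - E_\beta)\,(T^-_{\alpha\beta})^*,$$
and we can argue $T^+_{\beta\alpha} = -(T^-_{\beta\alpha})^*$ by sending both $\Psi^+_\alpha$ and $\Psi^-_\beta$ to t = ∞ or t = −∞. If this sounds
weird, it happens to be exactly what Weinberg does on page 114...
This gives the result. Weinberg does it by evolving $\Psi^+_\alpha$ to t = ∞ and expressing $\Psi^+_\alpha(t \to \infty)$ in
a basis of the “out” states. This is why he ended up with $T^+_{\beta\alpha}$. You could just as well evolve $\Psi^-_\beta$
backwards to t = −∞ and express it in a basis of the “in” states. Then you would get $(T^-_{\beta\alpha})^*$. In
some sense, what I did above is exactly intermediate between these two extremes.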
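The identity $(E \pm i\epsilon)^{-1} = \mathrm{P}\frac{1}{E} \mp i\pi\delta(E)$ used in this proof is only meaningful under an integral against a smooth function, and it can be checked numerically. The sketch below is an illustration only: the Gaussian test function, the grid, and the small but finite ε are arbitrary assumptions.

```python
import numpy as np

def smeared(f, eps, sign=+1, half_width=40.0, n=2_000_001):
    """Integrate f(E) / (E + sign*i*eps) over a symmetric grid (plain Riemann sum)."""
    E = np.linspace(-half_width, half_width, n)
    dE = E[1] - E[0]
    return np.sum(f(E) / (E + sign * 1j * eps)) * dE

f = lambda E: np.exp(-(E - 0.3) ** 2)   # smooth test envelope (an arbitrary choice)
eps = 1e-3
plus, minus = smeared(f, eps, +1), smeared(f, eps, -1)

print("Im part, +i eps:", plus.imag, "expected about", -np.pi * f(0.0))
print("Im part, -i eps:", minus.imag, "expected about", +np.pi * f(0.0))
print("difference:", plus - minus, "expected about", -2j * np.pi * f(0.0))
```

The real (principal-value) parts agree for the two signs and cancel in the difference, leaving −2πi f(0). This is the mechanism that produces the factor −2πi δ(E_α − E_β) in the S-matrix decomposition.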
3.2 Symmetries of the S-matrix
Question 29. What is meant by “Lorentz invariance of the S-matrix,” and when does it hold?
Answer 29. A theory is Lorentz invariant if there exists a unitary U(Λ, a) acting on both the “in”
and “out” states as
$$S_{\beta\alpha} = \langle\Psi^-_\beta|\Psi^+_\alpha\rangle = \langle U(\Lambda, a)\Psi^-_\beta|U(\Lambda, a)\Psi^+_\alpha\rangle.$$
However, such a unitary U(Λ, a) exists only for certain Hamiltonians. The problem is that “in”
and “out” states are differently defined, so we cannot be sure that U(Λ, a) acts in the same way on
both of them. We can solve this problem if we express both the “in” and “out” states in the same
basis, so there is no ambiguity. We use the non-interacting basis of wavefunctions, $\{\Phi_\alpha\}$, and let
$U_0(\Lambda, a)$ induce the corresponding transformation on the non-interacting states Φ. Invariance is
satisfied if
$$S_{\beta\alpha} = \langle\Phi_\beta|S\Phi_\alpha\rangle = \langle U_0(\Lambda, a)\Phi_\beta|S\,U_0(\Lambda, a)\Phi_\alpha\rangle \implies [U_0(\Lambda, a), S] = 0.$$
$U_0$ is the unitary that implements the Lorentz group transforms under the non-interacting Hamiltonian, $H_0$. All of its generators must commute with the S-operator,
$$[H_0, S] = [\vec P_0, S] = [\vec J_0, S] = [\vec K_0, S] = 0.$$
Weinberg (3.3.19) shows that an interacting theory is also Lorentz-invariant if
$$[V, \vec P_0] = [V, \vec J_0] = 0.$$
There is a discussion in the book of which of the above conditions are more easily satisfied than the
others. It is very subtle so I will not include it here. The main takeaway is that Lorentz invariance
is a property of a theory, not something that comes automatically by itself for any theory.
Question 30. How does the S-matrix encode conservation of 4-momentum?
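Answer 30. Use $S_{\beta\alpha} = \langle U(\Lambda, a)\Psi^-_\beta|U(\Lambda, a)\Psi^+_\alpha\rangle$ and
$$U(\Lambda, a)\Psi_\alpha = e^{-ia_\mu\left((\Lambda p_1)^\mu + (\Lambda p_2)^\mu + \cdots\right)} \sqrt{\frac{(\Lambda p_1)^0(\Lambda p_2)^0\cdots}{p_1^0 p_2^0\cdots}} \sum_{\sigma_1'\sigma_2'\cdots} D^{(j_1)}_{\sigma_1'\sigma_1}(W(\Lambda, p_1))\,D^{(j_2)}_{\sigma_2'\sigma_2}(W(\Lambda, p_2))\cdots\Psi_{\alpha'}.$$
The result is that
$$S_{\beta\alpha} \propto e^{ia_\mu\Lambda^\mu{}_\nu\left(p_1'^\nu + p_2'^\nu + \cdots - p_1^\nu - p_2^\nu - \cdots\right)},$$
where the unprimed momenta belong to α and the primed momenta to β. But $S_{\beta\alpha}$ has no $a_\mu$-dependence. Therefore,
$$\sum_i p_i^\nu = \sum_j p_j'^\nu.$$
We typically write
$$S_{\beta\alpha} = \delta(\beta - \alpha) - 2\pi i\,\delta^{(4)}(p_\beta - p_\alpha)\,M_{\beta\alpha}.$$
Physically, the way to say this is to argue that $S_{\beta\alpha}$ is a translationally-invariant scalar and combine
with $S \sim e^{ipx}$.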
Question 31. What are “internal symmetries of the S-matrix?”
Answer 31. It may be true that the wavefunctions $\Psi^+_\alpha$ are not invariant under a symmetry, but
the S-matrix is. The spirit is similar to the treatment of Lorentz invariance. Basically, if there
is an internal symmetry T, we would like there to be a unitary U(T) which acts the same way
on both “in” and “out” states. Similarly to the derivation above, we introduce $U_0(T)$ to act on
non-interacting states, and U(T) is a symmetry of the S-matrix if
$$[U_0(T), S] = 0, \quad\text{which is guaranteed when } [U_0(T), H_0] = [U_0(T), V] = 0.$$
Because of the decomposition of “in” and “out” states into superpositions of non-interacting
states, we may take $U(T) = U_0(T)$. This is not true for the Lorentz symmetry, because the boost
generators $\vec K$ have to be modified for the interacting case. The argument is a little subtle (pg
119).
Here are some examples of internal symmetries.
• Charge conservation: Let $U(T(\theta))\Psi_Q = e^{iQ\theta}\Psi_Q$. T(θ) is the U(1) symmetry of, say,
Maxwell's electrodynamics, and Q is the charge of the wavefunction $\Psi_Q$. When this is
implemented on an S-matrix that is invariant under $\Psi_Q \to U(T(\theta))\Psi_Q$, the conclusion is that
$$q_1 + q_2 + \cdots = q_1' + q_2' + \cdots,$$
so charge is conserved. This is exactly the same argument we used to show that momentum
is conserved, above.
• Time reversal: Time reversal is antiunitary, so we have to switch the order of entries in the
inner product which defines the S-matrix. More physically, time reversal switches “in” states
with “out” states:
$$\langle\Psi^-_\beta|\Psi^+_\alpha\rangle = \langle T\Psi^+_\alpha|T\Psi^-_\beta\rangle \implies S_{\beta\alpha} = S_{\mathcal{T}\alpha,\mathcal{T}\beta}.$$
We can show that the time reversal $T_0$ in $T_0\Phi_\alpha = \Phi_{\mathcal{T}\alpha}$ does, in fact, obey the commutations
$[T, H_0] = [T, V] = 0$. For example, starting from $\Psi^\pm_\alpha = \Omega(\mp\infty)\Phi_\alpha$ and applying T from the
left, we get
$$\Psi^\mp_{\mathcal{T}\alpha} = T\,\Omega(\mp\infty)\Phi_\alpha = [T, \Omega(\mp\infty)]\Phi_\alpha + \Omega(\mp\infty)(T\Phi_\alpha) = [T, \Omega(\mp\infty)]\Phi_\alpha + \Psi^\mp_{\mathcal{T}\alpha}.$$
Therefore, $[T, \Omega(\mp\infty)] = 0$. Another way to think of it is to use the fact that T is antiunitary,
so
$$T\,\Omega(-\infty)\Phi_\alpha = \Omega(\infty)\Phi_{\mathcal{T}\alpha}, \quad\text{or in other words}\quad T\Psi^+_\alpha = \Psi^-_{\mathcal{T}\alpha}.$$
• Parity: We can show $[P, H_0] = [P, V] = 0$ merely by looking at explicit forms of $H_0$, V and
seeing if they are invariant under $\vec x \to -\vec x$.
• CPT invariance: Because CPT is antiunitary, the same arguments as for T-invariance of
the S-matrix lead to
$$S_{\beta\alpha} = S_{\mathcal{CPT}\alpha,\,\mathcal{CPT}\beta}.$$
This gives some meaning to “antiparticles are particles moving backwards in time,” because
the order of α and β had to be switched (by antiunitarity). In particular,
$$M_{\alpha\alpha} = M_{\mathcal{CPT}\alpha,\,\mathcal{CPT}\alpha}.$$
By the optical theorem (see next section), the total reaction rate from an initial state of particles is the same as the total reaction rate from the same initial state, but with antiparticles
of reversed spin. For the case of one particle, this means that unstable particles have exactly
the same lifetime as their antiparticles.
3.3 Rates and cross-sections
This section is more illuminating than Schwartz, but the treatments are a bit different in interpretation, so it would be nice to read both together.
Question 32. The “master formula” of relativistic scattering theory is
$$d\Gamma(\alpha \to [\beta, \beta + d\beta]) = \frac{(2\pi)^{3N_\alpha - 2}}{V^{N_\alpha - 1}}\,\delta^{(4)}(p_\beta - p_\alpha)\,|M_{\beta\alpha}|^2\,d\beta.$$
I will not include the derivation because Weinberg's is exceedingly easy to follow. Interpret this
formula and explain what everything means.
Answer 32. Here are some insights into the above formula.
• We use Γ, a reaction rate, because the production of particles in the range [β, β + dβ] is
expected to be linear with time.
• The above formula only works when we ignore the δβα part of the S-matrix. In other words,
it accounts only for the cases where a nontrivial reaction occurred. That is why we got
|Mβα |2 .
• The factor of $V^{N_\alpha - 1}$ in the denominator makes sense: If we only have $N_\alpha = 1$ particle
participating in the reaction, then it shouldn't matter how large the volume of the container
is. For $N_\alpha > 1$, we add an extra $V^{-1}$ for every additional reactant particle, because they have to bump
into each other in the container.
• Why the exponent $3N_\alpha - 2$? This is from the normalization of the wavefunctions to have
probability unity of any given particle being in the box, and also the time normalization.
Essentially, we start with $(2\pi)^{3(N_\alpha - 1)}$, and then we multiply by $(2\pi)^2$ from the definition of
$M_{\beta\alpha}$ in terms of $S_{\beta\alpha}$, and then divide by 2π because of the time delta-function. Not very
illuminating.
• How to interpret the phase-space factor $\delta^{(4)}(p_\beta - p_\alpha)\,d\beta$? Schwartz gives a slightly different
formula for the scattering rate, because of his normalizations of the wavefunctions, and calls
his phase-space factor the Lorentz-invariant phase space factor, $d\Pi_{\rm LIPS}$. Weinberg's factor, of
course, is not Lorentz invariant because of length contraction and stuff, but he does mention
$d\Pi_{\rm LIPS}$. See pg. 67 and 138.
A good way to think about this is: $\int \frac{d^3\vec p}{2E_{\vec p}}$ is the manifestly Lorentz-invariant measure enforcing the condition that
the external particle be on mass-shell with positive energy,
$$\int d^4p\,\delta(p^2 - m^2)\,\theta(p^0) = \int \frac{d^3\vec p}{2E_{\vec p}}.$$
Question 33. How can we get a scattering cross-section σ from the scattering rate Γ?
Answer 33. Typically, people can't smash more than two particles together at once. Thus, the
scattering cross-section is defined as the rate per flux,
$$d\sigma(\alpha \to [\beta, \beta + d\beta]) = \frac{d\Gamma(\alpha \to [\beta, \beta + d\beta])}{\Phi_\alpha},$$
where $\Phi_\alpha = u_\alpha/V$ is the flux. Here, $u_\alpha$ is the relative velocity of the two projectiles. In general
frames, it turns out to be
$$u_\alpha = \frac{\sqrt{(p_1 \cdot p_2)^2 - (m_1 m_2)^2}}{E_1 E_2}.$$
This is a definition chosen so that σ is a Lorentz-invariant scalar. See Weinberg pg. 138.
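To see that this formula deserves the name “relative velocity,” one can evaluate it in two familiar frames. The following sketch is illustrative only: the function names and the sample masses and momenta are assumptions, and units with c = 1 are used.

```python
import numpy as np

def minkowski_dot(p, q):
    """p.q with signature (+, -, -, -); each argument is (E, px, py, pz)."""
    return p[0] * q[0] - np.dot(p[1:], q[1:])

def four_momentum(m, p3):
    """On-shell four-momentum (E, px, py, pz) for mass m and three-momentum p3."""
    p3 = np.asarray(p3, dtype=float)
    return np.concatenate(([np.sqrt(m**2 + p3 @ p3)], p3))

def relative_velocity(p1, p2):
    """u_alpha = sqrt((p1.p2)^2 - m1^2 m2^2) / (E1 E2)."""
    m1_sq, m2_sq = minkowski_dot(p1, p1), minkowski_dot(p2, p2)
    return np.sqrt(minkowski_dot(p1, p2) ** 2 - m1_sq * m2_sq) / (p1[0] * p2[0])

# Fixed target: particle 2 at rest, so u should be the speed of particle 1.
beam = four_momentum(0.5, [0.0, 0.0, 1.2])
target = four_momentum(2.0, [0.0, 0.0, 0.0])
u_fixed = relative_velocity(beam, target)
v_beam = np.linalg.norm(beam[1:]) / beam[0]

# Head-on collision with opposite momenta: u should be |v1| + |v2|.
a = four_momentum(0.5, [0.0, 0.0, 0.8])
b = four_momentum(2.0, [0.0, 0.0, -0.8])
u_cm = relative_velocity(a, b)
v_sum = 0.8 / a[0] + 0.8 / b[0]

print(u_fixed, v_beam)
print(u_cm, v_sum)
```

In the head-on case u can exceed 1 (for two nearly massless particles it approaches 2), so it is a closing speed as seen in that frame rather than the velocity of one particle in the rest frame of the other.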
Question 34. Describe old-fashioned perturbation theory.
Answer 34. We want to calculate $T^+_{\beta\alpha}$, and to do so we can use the Lippmann-Schwinger equation.
Multiplying on the left with $\langle\Phi_\beta|V$ and defining $V_{\beta\alpha} = \langle\Phi_\beta|V|\Phi_\alpha\rangle$ gives
$$T^+_{\beta\alpha} = V_{\beta\alpha} + \int d\gamma\,\frac{V_{\beta\gamma}V_{\gamma\alpha}}{E_\alpha - E_\gamma + i\epsilon} + \int d\gamma\,d\gamma'\,\frac{V_{\beta\gamma}V_{\gamma\gamma'}V_{\gamma'\alpha}}{(E_\alpha - E_\gamma + i\epsilon)(E_\alpha - E_{\gamma'} + i\epsilon)} + \cdots.$$
There is already a $\delta(E_\beta - E_\alpha)$ enforced in the definition of $S_{\beta\alpha}$. This is the same as second-order
perturbation theory in nonrelativistic quantum mechanics. In that case, you just take α = β and
that gives you the shift in energy to second order in V.
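The structure of this series is easiest to see in a finite-dimensional toy model, where the integrals over intermediate states become matrix products. The sketch below is an assumption-laden illustration, not Weinberg's construction: H0 has a handful of discrete levels, V is a small random symmetric matrix, and the energy is chosen away from the spectrum so that the series converges.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states = 6
E_levels = np.linspace(0.0, 5.0, n_states)          # eigenvalues of H0 (toy spectrum)
A = rng.normal(size=(n_states, n_states))
V = 0.01 * (A + A.T)                                # weak symmetric interaction

def born_series(E, order, eps=1e-3):
    """Partial sum V + V G0 V + ... up to the given power of V, with G0 = 1/(E - H0 + i eps)."""
    G0 = np.diag(1.0 / (E - E_levels + 1j * eps))
    term = V.astype(complex)
    total = term.copy()
    for _ in range(order - 1):
        term = V @ G0 @ term
        total += term
    return total

def exact_T(E, eps=1e-3):
    """Closed form T = (1 - V G0)^{-1} V of the same geometric series."""
    G0 = np.diag(1.0 / (E - E_levels + 1j * eps))
    return np.linalg.solve(np.eye(n_states) - V @ G0, V)

E = 2.5                                             # energy between two levels of H0
errors = {order: np.max(np.abs(born_series(E, order) - exact_T(E))) for order in (1, 2, 3, 8, 40)}
for order, err in errors.items():
    print(order, err)
```

Each extra power of V comes with one more energy denominator, exactly as in the expansion above, and the partial sums approach the closed-form solution of T = V + V G0 T when V is weak compared with the level spacing.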
Question 35. Explain why time-dependent perturbation theory is said to be manifestly Lorentz-invariant. Why is old-fashioned perturbation theory not manifestly Lorentz-invariant?
Answer 35. Time-dependent perturbation theory is said to be manifestly Lorentz-invariant because the S-operator can be written
$$S = 1 + \sum_{n=1}^{\infty} \frac{(-i)^n}{n!} \int d^4x_1 \cdots d^4x_n\,\hat T\{U(x_1) \cdots U(x_n)\},$$
where $U(x_i)$ is the interaction density, $V(t_i) = \int d^3\vec x_i\,U(x_i)$, and we assume $U(x_i)$ transforms as a
Lorentz scalar. The measure $d^4x$ is Lorentz-invariant due to the cancellation of length-contraction
and time-dilation. Additionally, we demand that the time-ordering above is Lorentz-invariant.
• If x1 − x2 is timelike, the time-ordering between x1 and x2 cannot be changed, i.e. if t1 > t2
in one frame, then t1 > t2 in all frames. There is nothing to worry about here.
• If x1 − x2 is spacelike, the time-ordering between x1 and x2 can be flipped by appropriate
Lorentz boost. This destroys Lorentz invariance unless
$$[U(x), U(x')] = 0 \quad\text{for } x - x' \text{ spacelike}.$$
This is required because the time-ordering operator T̂ commutes things past each other if
they are not in the right time-order (we are not even in second quantization yet!), and generally this introduces nontrivial commutators. We need them to vanish for the T̂ -operation
to be harmless.
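The two cases in the list above can be illustrated with a one-dimensional boost. This is a minimal sketch with made-up separations (c = 1); the function name is hypothetical.

```python
import numpy as np

def boosted_time_difference(dt, dx, beta):
    """Time separation t1 - t2 seen after a boost with velocity beta along x (c = 1)."""
    gamma = 1.0 / np.sqrt(1.0 - beta**2)
    return gamma * (dt - beta * dx)

betas = np.linspace(-0.99, 0.99, 199)

timelike = boosted_time_difference(dt=1.0, dx=0.5, beta=betas)    # |dx| < |dt|
spacelike = boosted_time_difference(dt=0.5, dx=1.0, beta=betas)   # |dx| > |dt|

timelike_order_fixed = bool(np.all(timelike > 0))
spacelike_order_flips = bool(np.any(spacelike < 0) and np.any(spacelike > 0))
print("timelike: sign of t1 - t2 is frame independent:", timelike_order_fixed)
print("spacelike: sign of t1 - t2 can flip:", spacelike_order_flips)
```

For the spacelike pair the sign changes once the boost velocity exceeds dt/dx, which is why the commutator condition is needed only at spacelike separation.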
The above is a rather strong condition, although the true condition for Lorentz invariance is a
little weaker.
Old-fashioned perturbation theory is said to not be Lorentz invariant because the
integration $\int d\gamma$ looks like $\int d^3\vec p_1\,d^3\vec p_2 \cdots$. Even if you wanted to slip out of this trap by writing,
for example,
$$S \supset \int \left(\frac{d^3\vec p_1\,d^3\vec p_2 \cdots}{E_1 E_2 \cdots}\right) \frac{V_{\beta\gamma}V_{\gamma\alpha}\,E_1 E_2 \cdots}{E_\alpha - E_\gamma + i\epsilon},$$
it wouldn't work because generally $E_\alpha$ and $E_\gamma$ contain particles moving at different velocities, so
$E_\alpha$ and $E_\gamma$ don't transform in the same way under a boost, so the denominator $E_\alpha - E_\gamma + i\epsilon$ does
not transform as a single Lorentz scalar.
Question 36. If someone asks me why people use time-ordered perturbation theory instead of
old-fashioned perturbation theory, what should I say?
Answer 36. There are a few very powerful reasons to use the time-ordered perturbation theory
instead of old-fashioned perturbation theory.
• Time-ordered perturbation theory is manifestly Lorentz-invariant, but OFPT is not.
• OFPT has lots of annoying denominators (for example, think about second-order perturbation theory from nonrelativistic QM). Time-ordered perturbation theory doesn't have any
denominators in the series expansion, at least before you insert the propagators, which have
the denominators.
• Time-ordered perturbation theory gives you an extraordinarily powerful theorem, the Gell-Mann and Low theorem, that is hard to show in OFPT.
However, old-fashioned perturbation theory can also be useful.
• If we want only the leading-order term in a process, often OFPT or Fermi's golden rule is
much faster than using the Feynman rules. For example, if you're doing QED to tree-level,
OFPT is way easier, since the virtual photon has only an undetermined energy, and it's easy
to do an energy integral. One of the reasons for this is that you don't have to remember
any Feynman rules; you just use second-quantization formalism. However, once you start to
have loops, it becomes easier to do the standard integrals from propagators.
• There is no time-ordering necessary in OFPT. You do, however, have to account for the
fact that virtual particles can be either advanced or retarded, which leads to twice as many
intermediate states (and hence intermediate energies) as you'd expect. (See Schwartz QFT
section 4.1 for an example of this. He derives the static Coulomb interaction from second-order
perturbation theory on the electron-photon vertex.)
• OFPT can give you effective Hamiltonians, as I just mentioned. Eugene Demler likes doing
this; it's essentially how he taught us Anderson's poor man's scaling. I don't see how time-ordered perturbation theory can give you an effective Hamiltonian because it seems to be only
useful for finding S-matrix elements. Of course, you can get the nonperturbative effective
Hamiltonian from path integral methods.
Finally, some good things to remember:
• A Feynman diagram has perfectly good meaning in both the time-ordered perturbation
theory and OFPT.
• Wick’s theorem holds in both formalisms.
• The idea of “field” as a distinct object is not really necessary in OFPT (or in many-body
physics). As in many-body physics, there is not a notion of a “field” φ(x) in the same sense
as we use it in relativistic QFT. I mean this in the sense that in many-body physics, we take
$$\psi(x) = \sum_k e^{ikx} a_k,$$
but in relativistic QFT, our field is different. We have to take
$$\phi(x) = \int_k \frac{1}{\sqrt{2\omega_k}}\left(a_k e^{ikx} + a_k^\dagger e^{-ikx}\right).$$
This is because the “field” in many-body physics does not have to transform as an irrep under
the Lorentz group. In my opinion, you can do everything in many-body physics without the
idea of “field,” if you know how to work with second quantization. From this viewpoint,
ψ(x) is merely the Fourier transform of an annihilation operator, but φ(x) is a true field in
the sense that it is a nontrivial combination of creation and annihilation operators, which
are necessary to preserve causality (see the chapter on constructing fields).
• The bare Green functions are not the same in both formalisms. OFPT has the repeated
structure ⟨α|V|β⟩, which involves doing lots of contractions. However, these contractions are
expectations of number operators with time-evolution. This is how we use Green functions
in many-body physics, and the “Green function,” which is really a contraction of creation
and annihilation operators, literally doesn't care about the Hamiltonian or Lagrangian we
are using. (See the problem in section 4.4 of these notes for an example.)
Actually, I lied. It cares a little bit; usually the Green function is kind of like
$$G_k(t, t') = i\,n(k)\,e^{-i\omega_k t} \implies G_k(\omega) = \frac{n(k)}{\omega - \omega_k}.$$
Here, n(k) is the distribution function, which is another way of saying number operator,
which shows we are still in the second-quantization regime with no need for “fields,” per se.
Certainly this is different from the Green functions in relativistic QFT.
In contrast, in time-ordered perturbation theory, things arise from a Lagrangian. The Green
functions, or propagators, are really nontrivial and involve spin sums and weird things like
that, which care a lot about what Lagrangian they come from.
• In OFPT the “fields” have no time-dependence, only position- or momentum-dependence. In
time-ordered PT, the fields have time-dependence. You can see the difference, for example,
between the many-body theory
$$H = \sum_k c_k^\dagger \frac{k^2}{2m} c_k + \sum_{kk'q} V(q)\,c_k^\dagger c_{k'}^\dagger c_{k+q} c_{k'-q}$$
and the relativistic theory
$$\mathcal{L} = \bar\psi(t, x)\,(i\partial\!\!\!/ - m)\,\psi(t, x).$$
• One final note on this subtle topic: In OFPT or many-body physics, the “fields” we work
with are always pure creation or annihilation operators, either in x-space or in k-space. In
relativistic QFT, the fields are nontrivial combinations of a† and a.
3.4 Implications of unitarity
This section proves a surprising number of interesting results. For example, Weinberg derives
Liouville's theorem and Boltzmann's H-theorem from unitarity, $S^\dagger S = SS^\dagger = 1$, a result due to
C.N. Yang. Amazing.
Question 37. What is “forward scattering?”
Answer 37. This is something that confused me for a long time, but now it's pretty obvious.
Look at the decomposition
$$S_{\beta\alpha} = \delta_{\beta\alpha} - 2\pi i\,\delta^{(4)}(p_\beta - p_\alpha)\,M_{\beta\alpha}.$$
The forward scattering term is the $M_{\alpha\alpha}$ term. This makes so much sense. Essentially, the perturbation V not only scatters to different states; it can also send you back to the original state.
Question 38. Describe the optical theorem. What does it have to do with unitarity?
Answer 38. Weinberg derives the following optical theorem,
$$\mathrm{Im}\,M_{\alpha\alpha} = -\frac{u_\alpha\sigma_\alpha}{16\pi^3}, \quad\text{or}\quad \mathrm{Im}\,f(\alpha \to \alpha) = \frac{k\sigma_\alpha}{4\pi}.$$
Here, $\sigma_\alpha$ is the total cross-section
$$\sigma_\alpha = \int d\beta\,\frac{d\sigma(\alpha \to [\beta, \beta + d\beta])}{d\beta}.$$
A consequence of the optical theorem is that the solid angle ∆Ω in which the scattering amplitude
f(α → β) is close to its maximum value f(α → α) shrinks with energy,
$$\Delta\Omega \sim \frac{1}{k^2\sigma_\alpha}.$$
The takeaway is that higher-energy beams give better-resolved diffraction peaks. Girma Hailu
taught me this in nonrelativistic QM, so there is nothing new here and I will not include the
derivation. A good question is “what does this have to do with unitarity?” The answer is that
the derivation hinges on the relation
$$\delta(\gamma - \alpha) = \int d\beta\,S^*_{\beta\gamma}S_{\beta\alpha}.$$
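The link between unitarity and the sign of Im M_αα can be seen in a finite-dimensional toy model. The sketch below is an illustration under assumptions: the continuum labels are replaced by five discrete states, the delta functions are dropped, and S is taken to be the exponential of a random Hermitian matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
H = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
H = H + H.conj().T                                           # Hermitian generator
evals, evecs = np.linalg.eigh(H)
S = evecs @ np.diag(np.exp(1j * evals)) @ evecs.conj().T     # S = exp(iH) is unitary

M = (S - np.eye(n)) / (-2j * np.pi)              # toy analogue of S = 1 - 2*pi*i*M
lhs = np.imag(np.diag(M))                        # "forward" amplitudes Im M_aa
rhs = -np.pi * np.sum(np.abs(M) ** 2, axis=0)    # -pi * sum_b |M_ba|^2
print(np.allclose(S.conj().T @ S, np.eye(n)), np.allclose(lhs, rhs))
```

The diagonal of $S^\dagger S = 1$ gives Im M_aa = −π Σ_b |M_ba|², so the forward amplitude is fixed by the total transition probability out of state a, and its imaginary part is never positive. This mirrors the minus sign in the optical theorem above.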
Question 39. Summarize the conclusions of Weinberg’s partial wave expansion.
Answer 39. Weinberg expands the wavefunction $\Phi_{\vec p E N}$, where N refers to all discrete indices, in
spherical harmonics, just like in nonrelativistic quantum mechanics.
The result is that generally, there is no notion of the phase shift because different particles are
produced. However, in the case that the outgoing particles must be of the same species as the
incoming particles, the S-matrix is diagonal in N, and there is a notion of the phase shift.
This holds, for example, at very low energies (but still relativistic velocities) such that the bound
state which must form for a non-diagonal outgoing state is unattainable. Then,
$$S^j_{l's'n',\,lsn}(E) = e^{2i\delta_{jlsn}(E)}\,\delta_{l'l}\,\delta_{s's}\,\delta_{n'n},$$
because the S-matrix must be unitary. Here, N = jlsn and $\delta_{jlsn}(E)$ is the energy-dependent phase
shift. The total cross-section turns out to be related to
$$\sigma \propto \sum_{jls} (2j + 1)\sin^2\delta_{jlsn}(E),$$
which is the same as in nonrelativistic QM. Conclusion: the partial wave expansion applies even at
relativistic velocities. The interpretation of partial wave expansion in terms of phase shifts applies
even at relativistic velocities, if the S-matrix is guaranteed to be diagonal. This is all a result of
the unitarity of the S-matrix.
3.5 Resonances
This is an interesting section with a high density of important points.
Question 40. What is a resonance? What are some mechanisms for the formation of a resonance?
Answer 40. A resonant state or bound state is an intermediate state containing a single
unstable particle, R, which eventually decays into the particles observed as the final state. If R is
long-lived, then the cross-section exhibits a peak, known as a resonance, around
$$E_{\rm CM} \approx E_R,$$
where $E_{\rm CM}$ is the scattering energy in the CM frame and $E_R$ is the rest energy of the unstable
particle, R. The particle is said to be long-lived if the decay rate is much less than the characteristic
frequency of the particle, i.e. if
$$\Gamma_{\rm decay} \ll E_R/\hbar.$$
There are different ways a resonance can come about.
• Strong and weak interactions: Perhaps the Hamiltonian has the form
$$H = H_0 + V_{\rm strong} + V_{\rm weak}.$$
Both V-terms are interaction terms. The external particles asymptotically tend to Φ-eigenstates of $H_0$, but the long-lived intermediate bound state, R, is an eigenstate of
$H_0 + V_{\rm strong}$. The $V_{\rm weak}$ perturbation allows R to decay into the outgoing particles.
An example is the decay of charged pions, https://en.wikipedia.org/wiki/Pion#Charged_pion_decays. The pion is an eigenstate of $H_0 = H_{\rm kin}$ plus $V_{\rm strong}$ = QCD but not an eigenstate of the
weak force, which is interpreted as $V_{\rm weak}$.
• A bound state can be long-lived because of a potential barrier that makes escape of constituent particles difficult. For example, in α-decay of heavy nuclei the α particle has to quantum-tunnel
through the Coulomb barrier that surrounds the attractive nuclear potential.
• Some reactions require statistically unlikely circumstances to take place. For example, if an
excited state of a heavy nucleus decays only when its energy is concentrated on a single
neutron, then there will be a long lifetime because this is unlikely.
Question 41. What does a resonance look like in the scattering amplitude $T^+_{\beta\alpha}$?
Answer 41. Regardless of the microscopic process, we can physically demand the following conditions for a resonance:
• The bound state R must decay exponentially with time, P(R) ∼ e−Γt .
• There must be a “jump” (i.e. a smoothed-out divergence) in the scattering amplitude at
Eα = ER . Physically, this means the energy of the incoming particles Eα should match the
energy of the bound state ER near the resonance.
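Starting with the Lippmann-Schwinger equation for the “in” wavefunction with envelope function $g(\alpha)$,
$$\Psi_g^+(t) = \Phi_g(t) + \int d\alpha\, d\beta\, \frac{e^{-iE_\alpha t}\, g(\alpha)\, T^+_{\beta\alpha}\, \Phi_\beta}{E_\alpha - E_\beta + i\epsilon},$$
we see that the integral
$$I_\beta^+ = \int d\alpha\, \frac{e^{-iE_\alpha t}\, g(\alpha)\, T^+_{\beta\alpha}}{E_\alpha - E_\beta + i\epsilon}$$
produces a resonance if
$$T^+_{\beta\alpha} \sim (E_\alpha - E_R + i\Gamma/2)^{-1} + \text{const}.$$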
The interpretation is that a certain part of the input wavefunction becomes the resonance as
$t \to \infty$, since not all of the wavefunction goes into the $E_\alpha = E_R - i\Gamma/2$ singularity. A good
question concerns the two-singularity structure in the above expression for $I_\beta^+$. Is the overall
$\delta(E_\alpha - E_\beta)$ energy conservation enforced even when the complex integration uses the $E_\alpha = E_R - i\Gamma/2$ singularity? It is. Weinberg (pg 115) gives
$$\Psi_g^+(t) \to \Phi_g(t) - 2\pi i \int d\beta\, e^{-iE_\beta t}\, \Phi_\beta \int d\alpha\, \delta(E_\alpha - E_\beta)\, g(\alpha)\, T^+_{\beta\alpha} \quad \text{as } t \to \infty.$$
The scattering part can be rewritten as
$$\int d\beta\, \Phi_\beta \int d\alpha\, \delta(E_\alpha - E_\beta)\, g(\alpha)\, e^{-iE_\alpha t}\, T^+_{\beta\alpha}.$$
The singularity in $T^+_{\beta\alpha}$ enforces $E_\alpha = E_R$ and the delta function simultaneously enforces $E_\alpha = E_\beta$.
As you can see, not all of the wavefunction evolves into the resonance. The majority of the
wavefunction never decays at all.
Question 42. Describe how the resonance shows up in the cross-section, σ(E).
Answer 42. Unsurprisingly, there is a Breit-Wigner peak, just like in nonrelativistic QM. The
scattering amplitude in the CM frame takes the form
$$S_{N'N}(E) = (S_0)_{N'N} + \frac{R_{N'N}}{E - E_R + i\Gamma/2}.$$
The important result (3.8.18) for decays into two-body states is
$$\sigma(n \to n'; E) \propto \frac{1}{k^2} \times \frac{\Gamma_n \Gamma_{n'}}{(E - E_R)^2 + \Gamma^2/4},$$
where $\Gamma_n = \Gamma \sum_{ls} |u_{lsn}|^2$. Here n refers to a two-body channel, so the above refers to scattering from
one two-body channel to another. This is the classical Lorentzian form that everybody knows.
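As a quick numerical illustration of the Lorentzian shape, here is a minimal sketch. The function name `breit_wigner`, the numerical values, and the choice to hold k fixed across the peak are all assumptions made for illustration (for a narrow resonance k varies slowly over a width Γ); the overall proportionality constant is dropped.

```python
def breit_wigner(E, E_R, Gamma, Gamma_in, Gamma_out, k=1.0):
    """Resonant cross-section up to an overall constant:
    (1/k^2) * Gamma_n * Gamma_n' / ((E - E_R)^2 + Gamma^2 / 4)."""
    return (1.0 / k**2) * Gamma_in * Gamma_out / ((E - E_R) ** 2 + Gamma**2 / 4.0)

# Hypothetical resonance parameters (arbitrary units).
E_R, Gamma = 10.0, 0.5
Gamma_in, Gamma_out = 0.3, 0.2   # partial widths, each no larger than Gamma

peak = breit_wigner(E_R, E_R, Gamma, Gamma_in, Gamma_out)
half_lo = breit_wigner(E_R - Gamma / 2, E_R, Gamma, Gamma_in, Gamma_out)
half_hi = breit_wigner(E_R + Gamma / 2, E_R, Gamma, Gamma_in, Gamma_out)

# The curve peaks at E = E_R and drops to half its peak value at E_R +/- Gamma/2,
# so Gamma is the full width at half maximum.
print(peak, half_lo, half_hi)
```

The check at $E_R \pm \Gamma/2$ is just the statement that Γ is the full width at half maximum of the peak.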
Question 43. Explain the bound on max σ(E) for a resonance.
Answer 43. Refer to the cross-section above and note that $\Gamma_n \le \Gamma$. We therefore have
$$\max_E \sigma \sim \left(\frac{2\pi}{k}\right)^2 = \lambda^2,$$
which means that cross-sections at a single resonance are roughly bounded by a square wavelength.
This applies in classical physics (for example, the resonance of a receiving antenna) and in quantum
physics as well. You can imagine this as follows: suppose a photon is impinging on an electron
which lives in a stationary atom. Not only does the photon have to hit the electron with the right
energy to excite it to a higher-energy state, it must also hit the electron in the right phase of its
“cycle.” In other words, the electron must happen to be hit in some sweet spot range $\Delta t$ of $e^{-i\omega t}$
oscillations. So,
$$\omega \Delta t \sim 1 \implies b = c\Delta t \sim c\omega^{-1} \sim \lambda,$$
where b is the impact parameter. So $\sigma \sim b^2 \sim \lambda^2$.
3.6 Problems
Question 44. Consider a theory with a separable interaction; that is,
$$\langle \Phi_\beta | V | \Phi_\alpha \rangle = g u_\beta u_\alpha^*,$$
where $g \in \mathbb{R}$ is a coupling constant and $u_\alpha \in \mathbb{C}$ such that $\sum_\alpha |u_\alpha|^2 = 1$. Use the Lippmann-Schwinger equation (3.1.16) to find explicit solutions for the “in” and “out” states and the S-matrix.
Answer 44. The solution is
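$$\Psi^\pm_\alpha = \Phi_\alpha + \int d\beta\, \frac{T^\pm_{\beta\alpha}\, \Phi_\beta}{E_\alpha - E_\beta \pm i\epsilon},$$
where
$$T^\pm_{\beta\alpha} = g u_\beta u_\alpha^* + g^2 \int d\gamma\, \frac{u_\beta u_\gamma^* u_\gamma u_\alpha^*}{E_\alpha - E_\gamma \pm i\epsilon} + g^3 \int d\gamma\, d\gamma'\, \frac{u_\beta u_\gamma^* u_\gamma u_{\gamma'}^* u_{\gamma'} u_\alpha^*}{(E_\alpha - E_\gamma \pm i\epsilon)(E_\alpha - E_{\gamma'} \pm i\epsilon)} + \cdots.$$
Not sure anybody would say this is “explicit.” I can make it look nicer by introducing a parameter
$$t^\pm = \int d\gamma\, \frac{u_\gamma^* u_\gamma}{E_\alpha - E_\gamma \pm i\epsilon}.$$
Here, $E_\alpha$ must be the initial energy of the reactants α; see https://en.wikipedia.org/wiki/Perturbation_theory_(quantum_mechanics)#Second-order_and_higher_corrections. So the expansion of the transfer matrix can be written
$$T^\pm_{\beta\alpha} = g u_\beta u_\alpha^* \left(1 + g t^\pm + g^2 (t^\pm)^2 + \cdots\right),$$
and folding up the geometric series gives
$$T^\pm_{\beta\alpha} = \frac{g u_\beta u_\alpha^*}{1 - g t^\pm}.$$
Substituting into the Lippmann-Schwinger equation gives an expression for $\Psi^\pm$ in terms of $\Phi$, which
is assumed to be known, like plane waves. Substituting into the decomposition of the S-matrix in
terms of the scattering matrix $T^+_{\beta\alpha}$ gives the S-matrix.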
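The resummed form can be sanity-checked on a toy model. The sketch below is an illustration under stated assumptions, not part of Weinberg's problem: the continuum label is replaced by four discrete levels, ε is kept finite, and the names `propagator`, `separable_T` and `lippmann_schwinger_residual` are hypothetical. It verifies that $T_{\beta\alpha} = g u_\beta u_\alpha^*/(1 - g t)$ satisfies $T_{\beta\alpha} = V_{\beta\alpha} + \sum_\gamma V_{\beta\gamma} T_{\gamma\alpha}/(E_\alpha - E_\gamma + i\epsilon)$.

```python
def propagator(E, E_gamma, eps):
    """Free propagator 1 / (E - E_gamma + i*eps)."""
    return 1.0 / (E - E_gamma + 1j * eps)

def separable_T(g, u, energies, eps):
    """Closed form T[b][a] = g u_b u_a^* / (1 - g t), with t evaluated at E_a."""
    n = len(u)
    T = [[0j] * n for _ in range(n)]
    for a in range(n):
        # t^+ uses the initial energy E_alpha of column a
        t = sum(abs(u[c]) ** 2 * propagator(energies[a], energies[c], eps)
                for c in range(n))
        for b in range(n):
            T[b][a] = g * u[b] * u[a].conjugate() / (1 - g * t)
    return T

def lippmann_schwinger_residual(g, u, energies, eps, T):
    """Largest |T - (V + V G0 T)| over all matrix elements."""
    n = len(u)
    worst = 0.0
    for a in range(n):
        for b in range(n):
            V_ba = g * u[b] * u[a].conjugate()
            VG0T = sum(g * u[b] * u[c].conjugate()
                       * propagator(energies[a], energies[c], eps) * T[c][a]
                       for c in range(n))
            worst = max(worst, abs(T[b][a] - V_ba - VG0T))
    return worst

# Hypothetical toy data: four levels, normalized u (sum of |u|^2 equals 1).
g = 0.8
u = [0.5 + 0j, 0.5j, -0.5 + 0j, 0.5 + 0j]
energies = [0.0, 1.0, 2.0, 3.0]
eps = 0.1

T = separable_T(g, u, energies, eps)
residual = lippmann_schwinger_residual(g, u, energies, eps, T)
print(residual)
```

Checking the integral equation directly avoids any question about whether the Born series converges for the chosen g.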
Question 45. Express the differential cross-section $d\sigma/d\Omega$ for two-body scattering in the lab frame,
in which one of the particles is initially at rest, in terms of kinematic variables and the matrix
element $M_{\beta\alpha}$.
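Answer 45. According to (3.4.15), the cross-section is defined as
$$d\sigma(\alpha \to \beta) = (2\pi)^4 u_\alpha^{-1} |M_{\beta\alpha}|^2 \delta^{(4)}(p_\beta - p_\alpha)\, d\beta.$$
Here, $u_\alpha$ is the velocity of the incoming particle in the lab frame. Let’s assume 2 → 2 scattering,
so there are two outgoing particles. Therefore,
$$\delta^{(4)}(p_\beta - p_\alpha)\, d\beta = \delta^{(3)}(\vec{p}_1' + \vec{p}_2' - \vec{p}_1)\, \delta(E_1' + E_2' - E)\, d^3\vec{p}_1'\, d^3\vec{p}_2'.$$
Integrating away the momentum-conserving δ-function with $d^3\vec{p}_1'$ gives
$$\delta^{(4)}(p_\beta - p_\alpha)\, d\beta = \delta\!\left(\sqrt{(\vec{p}_1 - \vec{p}_2')^2 + m_1'^2} + \sqrt{\vec{p}_2'^{\,2} + m_2'^2} - E\right) |\vec{p}_2'|^2\, d|\vec{p}_2'|\, d\Omega.$$
Using $\delta(f(x)) = \delta(x - x_0)/|f'(x_0)|$ and integrating over $|\vec{p}_2'|$ gives
$$\delta^{(4)}(p_\beta - p_\alpha)\, d\beta = \left( \frac{|\vec{p}_2'| - |\vec{p}_1| \cos\theta}{\sqrt{(\vec{p}_1 - \vec{p}_2')^2 + m_1'^2}} + \frac{|\vec{p}_2'|}{\sqrt{\vec{p}_2'^{\,2} + m_2'^2}} \right)^{-1} |\vec{p}_2'|^2\, d\Omega,$$
where θ is the angle between $\vec{p}_1$ and $\vec{p}_2'$.
Here, $|\vec{p}_2'|$ is taken to be that magnitude which satisfies the overall δ-function. The expression is
ugly but it can be solved for. Finally,
$$\frac{d\sigma}{d\Omega}(\alpha \to \beta) = (2\pi)^4 u_\alpha^{-1} |M_{\beta\alpha}|^2 \left( \frac{|\vec{p}_2'| - |\vec{p}_1| \cos\theta}{\sqrt{(\vec{p}_1 - \vec{p}_2')^2 + m_1'^2}} + \frac{|\vec{p}_2'|}{\sqrt{\vec{p}_2'^{\,2} + m_2'^2}} \right)^{-1} |\vec{p}_2'|^2.$$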
Note that there is a cos θ factor, so this is manifestly not spherically symmetric. That makes
sense, because uα picks out a preferred direction.
Question 46. Derive the time-dependent perturbation expansion (3.5.8) directly from the expansion (3.5.3) of old-fashioned perturbation theory.
Answer 46. (3.5.3) is
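$$T^+_{\beta\alpha} = V_{\beta\alpha} + \int d\gamma\, \frac{V_{\beta\gamma} V_{\gamma\alpha}}{E_\alpha - E_\gamma + i\epsilon} + \int d\gamma\, d\gamma'\, \frac{V_{\beta\gamma} V_{\gamma\gamma'} V_{\gamma'\alpha}}{(E_\alpha - E_\gamma + i\epsilon)(E_\alpha - E_{\gamma'} + i\epsilon)} + \cdots.$$
(3.5.8) is
$$S = 1 - i \int_{-\infty}^{\infty} dt_1\, V(t_1) + (-i)^2 \int_{-\infty}^{\infty} dt_1 \int_{-\infty}^{t_1} dt_2\, V(t_1) V(t_2) + \cdots.$$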
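The first thing to note is that $T^+_{\beta\alpha}$ is a scalar and S is an operator (obviously, it’s called the
S-operator). It’s easiest to go backwards. Recall that
$$S_{\beta\alpha} = \langle \Phi_\beta | S | \Phi_\alpha \rangle = \delta_{\beta\alpha} - 2\pi i\, \delta(E_\alpha - E_\beta)\, T^+_{\beta\alpha}$$
$$= \delta_{\beta\alpha} - i \int_{-\infty}^{\infty} dt_1\, \langle \Phi_\beta | V(t_1) | \Phi_\alpha \rangle + (-i)^2 \int_{-\infty}^{\infty} dt_1 \int_{-\infty}^{t_1} dt_2 \int d\gamma\, \langle \Phi_\beta | V(t_1) | \Phi_\gamma \rangle \langle \Phi_\gamma | V(t_2) | \Phi_\alpha \rangle + \cdots.$$
Everything here is in the interaction picture, so $V(t) = e^{iH_0 t} V e^{-iH_0 t}$. This implies
$$-i \int_{-\infty}^{\infty} dt_1\, \langle \Phi_\beta | V(t_1) | \Phi_\alpha \rangle = -i \int_{-\infty}^{\infty} dt_1\, e^{i(E_\beta - E_\alpha) t_1}\, V_{\beta\alpha} = -2\pi i\, \delta(E_\beta - E_\alpha)\, V_{\beta\alpha}.$$
The next term is
$$-\int d\gamma \int_{-\infty}^{\infty} dt_1\, e^{i(E_\beta - E_\gamma) t_1} \int_{-\infty}^{t_1} dt_2\, e^{i(E_\gamma - E_\alpha) t_2}\, V_{\beta\gamma} V_{\gamma\alpha}.$$
Inserting the right convergence factor, we find
$$\int_{-\infty}^{t_1} dt_2\, e^{i(E_\gamma - E_\alpha) t_2}\, e^{\epsilon t_2} = \frac{e^{i(E_\gamma - E_\alpha - i\epsilon) t_1}}{i(E_\gamma - E_\alpha - i\epsilon)},$$
and then, dropping ε in the exponent,
$$-\int d\gamma\, \frac{V_{\beta\gamma} V_{\gamma\alpha}}{i(E_\gamma - E_\alpha - i\epsilon)} \int_{-\infty}^{\infty} dt_1\, e^{i(E_\beta - E_\gamma) t_1}\, e^{i(E_\gamma - E_\alpha) t_1} = -2\pi i\, \delta(E_\alpha - E_\beta) \int d\gamma\, \frac{V_{\beta\gamma} V_{\gamma\alpha}}{E_\alpha - E_\gamma + i\epsilon},$$
as desired. So, the two expansions are the same, order-by-order.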
4 The Cluster Decomposition Principle
Question 47. Heuristically, what is the cluster decomposition principle and how is it encoded
in creation and annihilation operators?
Answer 47. The cluster decomposition principle is the following:
Physical processes occurring at large separation are independent.
It turns out that if the Hamiltonian is expressed as a sum of products of creation and annihilation
operators, then the S-matrix will automatically satisfy the cluster decomposition principle. This is
also why creation and annihilation operators are used in nonrelativistic statistical QM. Ultimately,
it is why we use creation and annihilation operators to introduce the idea that particle number is
not necessarily conserved.
4.1 Operator decomposition in $a_q^\dagger$ and $a_q$
Question 48. How are many-particle states normalized?
Answer 48. We would like to normalize many-particle states in preparation for the machinery of
raising and lowering operators. If one-particle states, described by q (which includes all quantum
numbers), are normalized as
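$$\langle \Phi_q | \Phi_{q'} \rangle = \delta(q - q'),$$
then to preserve the Bose-Einstein and Fermi-Dirac statistics, many-particle states must be normalized as
$$\langle \Phi_{q_1' q_2' \cdots q_N'} | \Phi_{q_1 q_2 \cdots q_M} \rangle = \delta_{NM} \sum_P (-1)^{F_P} \prod_i \delta(q_i - q'_{Pi}).$$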
Here, P is a permutation of the particles, FP is the fermionic index and tracks how many antisymmetric switches we made, and N, M are the numbers of particles in each state. Usually, only
one of the terms in this sum can be nonzero at a time. However, we must include all the possible
permutations because switching, for example, $q_1'$ with $q_2'$ may also give something that is nonzero.
For example,
$$\langle \Phi_{q_1' q_2'} | \Phi_{q_1 q_2} \rangle = \delta(q_1' - q_1)\delta(q_2' - q_2) \pm \delta(q_2' - q_1)\delta(q_1' - q_2).$$
This still isn’t clear to me. Think about it more
Question 49. Describe the action of creation and annihilation operators.
Answer 49. Creation operators add a particle to the wavefunction; annihilation operators take
away a particle. However, because of the (anti)symmetrization, if there are identical particles, we
don’t know which one to take away, so we must account for all possibilities! So, the definitions are
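$$a_q^\dagger \Phi_{q_1 \cdots q_N} = \Phi_{q q_1 \cdots q_N} \quad \text{and} \quad a_q \Phi_{q_1 \cdots q_N} = \sum_{r=1}^{N} (\pm 1)^{r+1} \delta(q - q_r)\, \Phi_{q_1 \cdots q_{r-1} q_{r+1} \cdots q_N}.$$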
The ± is for creation/annihilation operators which are bosonic and fermionic, respectively. For
more detail, see my notes on “2nd Quantization.” Both my notes and Weinberg’s book follow the
logic that we should only introduce canonical commutators after we define the action of a† , a on
wavefunctions. Weinberg’s method gives an interesting derivation of the canonical commutator:
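$$a_{q'} a_q^\dagger \Phi_{q_1 \cdots q_N} = \delta(q' - q)\, \Phi_{q_1 \cdots q_N} + \sum_{r=1}^{N} (\pm 1)^{r+2} \delta(q' - q_r)\, \Phi_{q q_1 \cdots q_{r-1} q_{r+1} \cdots q_N},$$
$$a_q^\dagger a_{q'} \Phi_{q_1 \cdots q_N} = \sum_{r=1}^{N} (\pm 1)^{r+1} \delta(q' - q_r)\, \Phi_{q q_1 \cdots q_{r-1} q_{r+1} \cdots q_N}.$$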
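Combining the two expressions with the appropriate sign gives $a_{q'} a_q^\dagger \mp a_q^\dagger a_{q'} = \delta(q' - q)$, i.e. a commutator for bosons and an anticommutator for fermions. The following minimal sketch checks this on a toy model. Its assumptions: the labels q are small integers, so $\delta(q - q')$ becomes a Kronecker delta; states are stored as dictionaries from sorted label tuples to coefficients; and all function names (`canon`, `create`, `annihilate`, `combine`, `check`) are hypothetical.

```python
def canon(labels, sign):
    """Sort labels; return (sorted tuple, factor). For bosons (sign=+1) the
    factor is 1. For fermions (sign=-1) it is the permutation parity, or 0
    if a label repeats (Pauli exclusion)."""
    labels = list(labels)
    factor = 1
    for i in range(len(labels)):
        for j in range(len(labels) - 1 - i):
            if labels[j] > labels[j + 1]:
                labels[j], labels[j + 1] = labels[j + 1], labels[j]
                factor *= sign
    if sign == -1 and len(set(labels)) < len(labels):
        factor = 0
    return tuple(labels), factor

def create(q, state, sign):
    """a_q^dagger: prepend q to every basis label list."""
    out = {}
    for labels, c in state.items():
        key, f = canon((q,) + labels, sign)
        if f:
            out[key] = out.get(key, 0) + f * c
    return out

def annihilate(q, state, sign):
    """a_q: sum over positions r with q_r == q, weighted by (sign)^(r+1)
    for 1-based r, which equals sign**r for the 0-based r used here."""
    out = {}
    for labels, c in state.items():
        for r, qr in enumerate(labels):
            if qr == q:
                rest = labels[:r] + labels[r + 1:]
                out[rest] = out.get(rest, 0) + sign**r * c
    return out

def combine(a, b, coeff):
    """Return a + coeff * b, dropping zero entries."""
    out = dict(a)
    for k, v in b.items():
        out[k] = out.get(k, 0) + coeff * v
    return {k: v for k, v in out.items() if v != 0}

def check(sign, state, labels=range(3)):
    """True if a_q' a_q^dagger -/+ a_q^dagger a_q' acts as delta_{q'q}."""
    for qp in labels:
        for q in labels:
            lhs = combine(annihilate(qp, create(q, state, sign), sign),
                          create(q, annihilate(qp, state, sign), sign), -sign)
            rhs = dict(state) if qp == q else {}
            if lhs != rhs:
                return False
    return True

boson_state = {(0, 0, 1): 1, (2,): 3}          # keys must be sorted tuples
fermion_state = {(0, 2): 1, (1,): -2, (): 5}   # no repeated labels
print(check(+1, boson_state), check(-1, fermion_state))
```

Note that no $\sqrt{n}$ factors appear: the states follow the unnormalized convention of the text, in which $a_q^\dagger$ simply prepends a label.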
Question 50. To set up discussion of cluster decomposition and Hamiltonians, prove the following
theorem:
Any operator O can be expressed as a sum of products of creation and annihilation operators,
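$$O = \sum_{N=0}^{\infty} \sum_{M=0}^{\infty} \int d^N \vec{q}\,' \int d^M \vec{q}\;\, a^\dagger_{q_1'} \cdots a^\dagger_{q_N'}\, a_{q_M} \cdots a_{q_1}\, C_{NM}(\vec{q}\,', \vec{q}).$$
Here, $\vec{q} = (q_1, \cdots, q_M)$ and $\vec{q}\,' = (q_1', \cdots, q_N')$.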
Answer 50. First, let’s think about what this means. We would like to reproduce the matrix
elements of O between different wavefunctions, for example $\langle \Phi_{q_1' \cdots q_M'} | O | \Phi_{q_1 \cdots q_N} \rangle$. What the above
theorem means is that, because $| \Phi_{q_1' \cdots q_M'} \rangle$ and $| \Phi_{q_1 \cdots q_N} \rangle$ differ only by their particle content, and
the matrix element itself is a scalar, we can reproduce the matrix elements of any O by engineering
a combination of particle-content-changes (i.e. creation and annihilation operators), along with
some scalar coefficients $C_{NM}$.
The proof follows by induction. Letting $|\Phi_0\rangle$ be the vacuum, we have $C_{00} = \langle \Phi_0 | O | \Phi_0 \rangle$. Now
for the inductive step. Suppose we have successfully replicated all matrix elements of O for
N < L, M ≤ K or N ≤ L, M < K. We would like to show that we can replicate the matrix
element for N = L, M = K.
Pick two vectors, $| \Phi_{p_1 \cdots p_K} \rangle$ and $\langle \Phi_{p_1' \cdots p_L'} |$, and compute the matrix element $\langle \Phi_{p_1' \cdots p_L'} | O | \Phi_{p_1 \cdots p_K} \rangle$.
If N > L or M > K, the matrix element $\langle \Phi_{p_1' \cdots p_L'} | a^\dagger_{q_1'} \cdots a^\dagger_{q_N'}\, a_{q_M} \cdots a_{q_1} | \Phi_{p_1 \cdots p_K} \rangle$ is equal to zero
because either the bra or ket is annihilated. There are three possibilities:
• If N ≤ L and M < K, we know CN M already.
• Same for N < L and M ≤ K.
• If N = L and M = K, then the evaluation of the above gives
$$\langle \Phi_{p_1' \cdots p_L'} | O | \Phi_{p_1 \cdots p_K} \rangle \supset \sum_{P, P'} C_{LK}(P' p_1', \cdots, P' p_L', P p_1, \cdots, P p_K).$$
This is a little different from Weinberg, who got just $L!\,K!\,C_{LK}(q_1' \cdots q_L', q_1 \cdots q_K)$. I think there is no reason that $C_{LK}$ for particle interchanges
should be equal to each other; the only restriction is on their sum, via the matrix element of O.
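This follows, for example, by looking at the matrix element
$$\int d^2 \vec{q}\,'\, d^3 \vec{q}\;\, C_{NM}(\vec{q}\,', \vec{q})\, \langle \Phi_{p_1' p_2'} | a^\dagger_{q_1'} a^\dagger_{q_2'}\, a_{q_3} a_{q_2} a_{q_1} | \Phi_{p_1 p_2 p_3} \rangle.$$
We find that
$$a_{q_3} a_{q_2} a_{q_1} | \Phi_{p_1 p_2 p_3} \rangle = \sum_{r_1, r_2, r_3 = 1}^{3} (\pm 1)^{r_1 + r_2 + r_3 + 3}\, \delta(q_1 - p_{r_1}) \delta(q_2 - p_{r_2}) \delta(q_3 - p_{r_3})\, | \Phi_0 \rangle$$
$$= - \sum_{r_1, r_2, r_3 = 1}^{3} \delta(q_1 - p_{r_1}) \delta(q_2 - p_{r_2}) \delta(q_3 - p_{r_3})\, | \Phi_0 \rangle$$
because $r_1 + r_2 + r_3 = 1 + 2 + 3 = 6$, in some order. There will also be an overall (−1) from
$\langle \Phi_{p_1' p_2'} | a^\dagger_{q_1'} a^\dagger_{q_2'}$, so there is an overall positive sign.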
Conclusion: there is enough freedom to choose the matrix elements $C_{LK}(\vec{q}\,', \vec{q})$ to match the matrix
elements of O. According to my logic, the construction is not unique.
Question 51. Discuss the implications of the above theorem for various operators.
Answer 51. We consider several examples.
• Additive operators: For example, the free-particle Hamiltonian is additive because
$$H_0 | \Phi_{q_1 \cdots q_N} \rangle = (E_1 + \cdots + E_N) | \Phi_{q_1 \cdots q_N} \rangle.$$
Such an operator can be written using only the N = M = 1 term. Moreover, this is the
only way to write $H_0$ (because $H_0$ conserves particle number, N = M; separability then implies
N = M = 1). Thus, the free-particle Hamiltonian can always be expressed as
$$H_0 = \int dq\, E(q)\, a_q^\dagger a_q.$$
Momentum and the $\hat{z}$-projection of the spin are also additive.
• Symmetry operators: The unitaries that implement $\Lambda^\mu{}_\nu$, P, C, T can be expressed in terms
of creation and annihilation operators. The interpretation of such a unitary constructed
in creation and annihilation operators is, for example for C: annihilate all particles in the
wavefunction and then create everything again, but with the opposite charge (and perhaps
a phase). In fact, the behavior of wavefunctions Ψ or Φ under symmetry operations means
that creation and annihilation operators must obey very similar rules.
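For example,
$$U(\Lambda, a) \Psi_{p\sigma} = e^{-i(\Lambda p) \cdot a} \sqrt{\frac{(\Lambda p)^0}{p^0}} \sum_{\sigma'} D^{(j)}_{\sigma'\sigma}(W(\Lambda, p))\, \Psi_{\Lambda p, \sigma'}$$
and the construction of one-particle states, $\Psi_{p\sigma} = a^\dagger_{p\sigma} \Psi_0$, implies
$$U(\Lambda, a)\, a^\dagger_{p\sigma}\, U(\Lambda, a)^{-1} = e^{-i(\Lambda p) \cdot a} \sqrt{\frac{(\Lambda p)^0}{p^0}} \sum_{\sigma'} D^{(j)}_{\sigma'\sigma}(W(\Lambda, p))\, a^\dagger_{\Lambda p, \sigma'}.$$
We have to include a $U^{-1}$ because of how we create many-particle states. Think about it!
(The vacuum is invariant, $U |\Psi_0\rangle = |\Psi_0\rangle$.)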
The operators C, P, T act on creation and annihilation operators in similar ways to how
they act on wavefunctions.
4.2 Factorization of the S-matrix
Question 52. State the cluster decomposition principle in mathematical language.
Answer 52. Let there be N scattering processes occurring simultaneously but at large separations
from each other. Then the S-matrix element for the meta-process that includes all smaller processes
should factorize,
$$S_{\beta_1 + \cdots + \beta_N,\, \alpha_1 + \cdots + \alpha_N} \to S_{\beta_1 \alpha_1} \cdots S_{\beta_N \alpha_N} \quad \text{as } |\vec{r}_i - \vec{r}_j| \to \infty.$$
Above, $\alpha_i$ and $\beta_i$ are multi-particle states. All particles in a single state, say $\alpha_i$, are localized around
the position $\vec{r}_i$ in 3-space. Factorization of the S-matrix ensures a corresponding factorization of
transition probabilities, so experimental results will be uncorrelated.
Question 53. State the cluster decomposition principle in terms of the connected S-matrix, $S^C_{\beta\alpha}$.
Answer 53. Let α and β be states containing particles. The cluster decomposition principle is
equivalent to the following:
$S^C_{\beta\alpha}$ vanishes if any particle in α + β is far away from the other particles in α + β.
To summarize, cluster decomposition says that the S-matrix should factorize, and that the connected
S-matrix should vanish. (Under the large-separation condition, of course.)
Question 54. Define the connected S-matrix.
Answer 54. Take the total states α and β and partition their particle content into an equal
number of clusters, where order of particles within a cluster does not matter:
α = α1 + · · · + αn and β = β1 + · · · + βn .
There are many different ways to partition the particles. In the above, we partitioned both states
into n clusters. It is also possible to have n = 1, which is the trivial partition, and it is not possible
to have clusters which are empty. (To see why, suppose that $\alpha_i$ is non-empty but $\beta_i$ is empty. The
scattering process $\alpha_i \to \beta_i$ automatically violates energy conservation, since the vacuum has no
energy. It also violates other things; for example, often $[H, J^2] = 0$.) We define $S^C_{\gamma\sigma}$ to satisfy
$$S_{\beta\alpha} = \sum_{\text{all partitions}} (-1)^F\, S^C_{\beta_1 \alpha_1} \cdots S^C_{\beta_n \alpha_n},$$
where F is the fermionic index due to the process of partitioning. The connected S-matrix is
constrained to be unique; it is a single matrix regardless of which overall process $S_{\beta\alpha}$ it happens
to be participating in.
For example, we can use small particle numbers to determine the connected S-matrices iteratively:
$$S_{q'q} = S^C_{q'q} = \delta(q' - q),$$
$$S_{q_1' q_2', q_1 q_2} = S^C_{q_1' q_2', q_1 q_2} + S^C_{q_1' q_1} S^C_{q_2' q_2} \pm S^C_{q_1' q_2} S^C_{q_2' q_1}, \quad \text{etc.}$$
Question 55. What is the physical interpretation of the connected S-matrix?
Answer 55. To get some intuition, let’s look at the expression for $S^C_{q_1' q_2', q_1 q_2}$,
$$S^C_{q_1' q_2', q_1 q_2} = S_{q_1' q_2', q_1 q_2} - (S^C_{q_1' q_1} S^C_{q_2' q_2} \pm S^C_{q_1' q_2} S^C_{q_2' q_1}).$$
What this means is: take the entire S-matrix and subtract away the processes that are products of
two 1 → 1 processes. Therefore, $S^C_{q_1' q_2', q_1 q_2}$ includes only scattering processes in which particles $q_1'$
and $q_2'$ nontrivially interact with each other. In general, $S^C_{q_1' \cdots q_n', q_1 \cdots q_m}$ means the particles $q_1' \cdots q_n'$
all have to be entangled somehow, and that it is impossible to factorize the scattering process
represented by $S^C_{q_1' \cdots q_n', q_1 \cdots q_m}$ into products of smaller scattering processes. To be entangled with
someone else, you have to be close to them! Hence, the connected S-matrix vanishes if everyone
isn’t “in the same room,” so to speak.
That explains why $\alpha = q_1 \cdots q_m$ must all be close together. Why must the particles in $\beta = q_1' \cdots q_n'$
be both close together and close to cluster α? If everyone went to a party in a single room, it is
impossible for one of them to apparate and re-appear 500 miles away from the party in an instant.
So, all of the partygoers are still relatively localized after they walk out of the party, and all of α
must be close to all of β. (This implicitly assumes that the scattering process α → β is relatively
fast, so particles have not had time to travel r → ∞ distance away from the scattering center.)
Question 56. Show with an example that the vanishing of the connected S-matrix for large
separations is equivalent to the factorization of the original S-matrix.
Answer 56. This is a nice example and comes from Weinberg, but I did a simpler case. Suppose
the scattering process is $123 \to 1'2'3'$. Further, assume that $1, 2'$ are close, and $2, 3, 1', 3'$ are close,
but $1, 2'$ are far away from $2, 3, 1', 3'$. For simplicity, let all particles be bosonic.
We can decompose the S-matrix into connected S-matrices,
$$S_{1'2'3', 123} = S^C_{1'2'3', 123} + S^C_{1'2', 12} S^C_{3'3} + S^C_{1'2', 13} S^C_{3'2} + \cdots + S^C_{1'3} S^C_{2'2} S^C_{3'1}.$$
Most of these terms get killed. The only terms that don’t get killed are
$$S_{1'2'3', 123} = S^C_{1'3', 23} S^C_{2'1} + S^C_{2'1} S^C_{1'2} S^C_{3'3} + S^C_{2'1} S^C_{1'3} S^C_{3'2} = S_{2'1} S_{1'3', 23}.$$
That is the result of the cluster decomposition principle for regular S-matrices.
Question 57. How is vanishing of the connected S-matrix for large spatial separations encoded
in its momentum-space form?
Answer 57. The Fourier transform from $\vec{k}$-space to $\vec{x}$-space is
$$S^C_{\vec{x}', \vec{x}} = \int_{\vec{p}', \vec{p}} e^{i \vec{p}_1' \cdot \vec{x}_1'} \cdots e^{i \vec{p}_N' \cdot \vec{x}_N'}\, e^{-i \vec{p}_1 \cdot \vec{x}_1} \cdots e^{-i \vec{p}_M \cdot \vec{x}_M}\, S^C_{\vec{p}', \vec{p}}.$$
Physically, we demand that $S^C_{\vec{x}', \vec{x}}$ be invariant under any global translation,
$$\vec{x}', \vec{x} \to \vec{x}' + \vec{a}, \vec{x} + \vec{a}.$$
This means the overall phase factor from the exponentials is not allowed to change, and hence
there is a constraint
$$\vec{p}_1' + \cdots + \vec{p}_N' - \vec{p}_1 - \cdots - \vec{p}_M = 0.$$
Therefore, the vanishing of the connected S-matrix for large spatial separations is encoded in the
requirement that
$$S^C_{\vec{p}', \vec{p}} \propto \delta^{(3)}\Big(\sum \vec{p}' - \sum \vec{p}\Big).$$
In fact, the above requirement is also enforced if
$$S^C_{\vec{p}', \vec{p}} \propto \delta^{(3)}\Big(\sum \vec{p}_1' - \sum \vec{p}_1\Big)\, \delta^{(3)}\Big(\sum \vec{p}_2' - \sum \vec{p}_2\Big),$$
where the particle content of both incoming and outgoing states has been partitioned into two
parts. But, the two-δ-function structure is not allowed by cluster decomposition, because it implies
we could move cluster 1 very far away from cluster 2 and still have a nonvanishing matrix element.
Conclusion: Cluster decomposition implies $S^C_{\vec{p}', \vec{p}}$ has a single momentum-conserving delta function,
$\delta(\sum \vec{p}' - \sum \vec{p})$.
4.3 Which Hamiltonians satisfy cluster decomposition?
Question 58. Which Hamiltonians satisfy the cluster decomposition principle?
Answer 58. It is always possible to write the Hamiltonian in raising and lowering operators,
$$H = \sum_{N=0}^{\infty} \sum_{M=0}^{\infty} \int d^N \vec{q}\,' \int d^M \vec{q}\;\, a^\dagger_{q_1'} \cdots a^\dagger_{q_N'}\, a_{q_M} \cdots a_{q_1}\, h_{NM}(\vec{q}\,', \vec{q}).$$
The claim is that
H yields an S-matrix that satisfies cluster decomposition if (and perhaps only if?) the coefficient
function $h_{NM}(\vec{q}\,', \vec{q})$ contains only a single momentum-conserving δ-function, $\delta(\sum \vec{q}\,' - \sum \vec{q})$.
This condition is automatically true for the free Hamiltonian H0 , so to be true for the total
Hamiltonian H it must also be true for the interaction, V. The proof is a little involved (pg.
183), but I can give the main part of the argument. In the previous section, we found that cluster
decomposition implies that
$$S^C_{\vec{p}', \vec{p}} \propto \delta^{(3)}\Big(\sum \vec{p}' - \sum \vec{p}\Big),$$
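with no other smaller momentum-conserving δ-functions. Also, recall that
$$S_{\beta\alpha} = \sum_{n=0}^{\infty} \frac{(-i)^n}{n!} \int_{-\infty}^{\infty} dt_1 \cdots dt_n\, \langle \Phi_\beta | \hat{T}\{V(t_1) \cdots V(t_n)\} | \Phi_\alpha \rangle.$$
Consider only the terms in the calculation of the connected S-matrix,
$$S^C_{\beta\alpha} = \sum_{n=0}^{\infty} \frac{(-i)^n}{n!} \int_{-\infty}^{\infty} dt_1 \cdots dt_n\, \langle \Phi_\beta | \hat{T}\{V(t_1) \cdots V(t_n)\} | \Phi_\alpha \rangle_C.$$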
This connected matrix element is represented graphically by a Feynman diagram in which every
point is connected to every other by some path on the graph, and each vertex carries some δ-functions inside. The question is, “how many momenta do we need to fix?”
Let the connected graph have V vertices, I internal lines, and L loops; assume for simplicity that
each vertex carries exactly one δ-function, as required by the theorem. There are V δ-functions,
of which I − L are used to fix internal momenta. So, we have V − I + L δ-functions left to fix
external momenta. For a connected graph, Euler’s theorem gives V − I + L = 1. This is exactly
the overall δ-function in $S^C_{\vec{q}\,', \vec{q}} \supset \delta(\sum \vec{q}\,' - \sum \vec{q})$.
Each term in V physically must have at least one δ-function. What this theorem says is that if
any term in V had more than one δ-function, we would overspecify the momenta of the external
particles, in violation of the cluster decomposition principle.
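The counting V − I + L can be tried out on small graphs. The sketch below is only an illustration: it treats a diagram as a plain graph (vertices numbered from 0, internal lines as vertex pairs, external legs ignored), and the function name `components_and_loops` is hypothetical. Loops are counted as the internal lines that close a cycle while the graph is built up with a union-find structure.

```python
def components_and_loops(num_vertices, internal_lines):
    """Return (number of connected components, number of independent loops)."""
    parent = list(range(num_vertices))

    def find(x):
        # Walk to the root, compressing the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    loops = 0
    for a, b in internal_lines:
        ra, rb = find(a), find(b)
        if ra == rb:
            loops += 1          # line closes a cycle: one free loop momentum
        else:
            parent[ra] = rb     # line joins two previously separate pieces
    components = len({find(v) for v in range(num_vertices)})
    return components, loops

def leftover_deltas(num_vertices, internal_lines):
    """V - I + L: delta-functions left over to constrain external momenta."""
    _, loops = components_and_loops(num_vertices, internal_lines)
    return num_vertices - len(internal_lines) + loops

tree_exchange = leftover_deltas(2, [(0, 1)])                # one internal line
one_loop_bubble = leftover_deltas(2, [(0, 1), (0, 1)])      # two lines, one loop
disconnected = leftover_deltas(4, [(0, 1), (2, 3)])         # two separate pieces
print(tree_exchange, one_loop_bubble, disconnected)
```

For the two connected examples one δ-function is left over, while the disconnected example leaves two, one per connected piece, which is the two-δ-function structure that a connected amplitude must not have.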
4.4 Problems
Question 59. Consider an interaction
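$$V = g \int_{\vec{p}_1 \vec{p}_2 \vec{p}_3 \vec{p}_4} \delta^{(3)}(\vec{p}_1 + \vec{p}_2 - \vec{p}_3 - \vec{p}_4) \times a^\dagger_{\vec{p}_1} a^\dagger_{\vec{p}_2} a_{\vec{p}_3} a_{\vec{p}_4}.$$
Here, $a_{\vec{p}}$ annihilates a spinless boson of mass M > 0. Use perturbation theory to calculate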
the S-matrix element for scattering of these particles in the CM frame, to O(g). What is the
corresponding differential cross-section?
Answer 59. According to (3.4.30), the differential cross-section for 2 → 2 scattering in the CM frame
is
$$\frac{d\sigma(\alpha \to \beta)}{d\Omega} = \frac{(2\pi)^4\, k'\, E_1' E_2' E_1 E_2}{E^2 k}\, |M_{\beta\alpha}|^2,$$
where $k := |\vec{p}_1| = |\vec{p}_2|$, $k' := |\vec{p}_1'| = |\vec{p}_2'|$.
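First, we will calculate the S-matrix element. The nontrivial part of $S_{\beta\alpha}$ is
$$T^+_{\beta\alpha} = \langle \Phi_\beta | V \Psi^+_\alpha \rangle \approx \langle \Phi_\beta | V | \Phi_\alpha \rangle \quad \text{to } O(g).$$
Using $\Phi_\alpha = a^\dagger_{\vec{p}_1} a^\dagger_{\vec{p}_2} |0\rangle$, $\Phi_\beta = a^\dagger_{\vec{p}_1'} a^\dagger_{\vec{p}_2'} |0\rangle$ gives
$$S_{\beta\alpha} \approx g \int_{1234} \delta(\vec{k}_{\rm tot})\, \langle 0 | a_{\vec{p}_2} a_{\vec{p}_1}\, a^\dagger_{\vec{k}_1} a^\dagger_{\vec{k}_2} a_{\vec{k}_3} a_{\vec{k}_4}\, a^\dagger_{\vec{p}_1'} a^\dagger_{\vec{p}_2'} | 0 \rangle.$$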
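There are three corresponding diagrams.
The result is
$$S_{\beta\alpha}/g = 4\Big( \delta(\vec{p}_1 + \vec{p}_2 - \vec{p}_1' - \vec{p}_2') + 4\, \delta(\vec{p}_1 - \vec{p}_1') \delta(\vec{p}_2 - \vec{p}_2') + 2\, \delta(\vec{p}_1 - \vec{p}_1') \delta(\vec{p}_2 - \vec{p}_2') \Big( \int \frac{d^3 \vec{k}}{(2\pi)^3} \Big) \Big).$$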
We can see the first term is connected (because there is only an overall 3-momentum conserving
δ-function) and the second and third terms are not connected. In fact, the third term does not
have any scattering of the original particles at all, so there is a divergent integral. We will just
ignore it. The first term is real scattering that can change directions and stuff; the second term
is forward scattering which contributes to scattering but preserves the directions of the original
particles.
If we are only looking for scattering not in the forward direction, then the first term is the only
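relevant one. Using $S_{\beta\alpha} = \delta_{\beta\alpha} - 2\pi i\, \delta^{(4)}(p_\beta - p_\alpha)\, M_{\beta\alpha}$ and
$$S_{\beta\alpha} = \delta(\beta - \alpha) - 2\pi i\, \delta(E_\alpha - E_\beta)\, T^+_{\beta\alpha}, \quad \text{where } T^+_{\beta\alpha} = \langle \Phi_\beta | V \Psi^+_\alpha \rangle,$$
and ignoring the trivial part gives $M_{\beta\alpha} = 4g$. Finally,
$$\frac{d\sigma(\alpha \to \beta)}{d\Omega} = 16 g^2\, \frac{(2\pi)^4\, k'\, E_1' E_2' E_1 E_2}{E^2 k}.$$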
Question 60. A coherent state $\Phi_\lambda$ is defined to be an eigenstate of the annihilation operators
$a_q$ with eigenvalues $\lambda(q)$. Construct such a state as a superposition of the multi-particle states
$\Phi_{q_1 \cdots q_N}$.
Answer 60. The action of an annihilation operator is
$$a_q \Phi_{q_1 \cdots q_N} = \sum_{r=1}^{N} (\pm 1)^{r+1} \delta(q - q_r)\, \Phi_{q_1 \cdots q_{r-1} q_{r+1} \cdots q_N}.$$
Just like in the path integral formulation of, for example, superconductivity, we have
$$a_q |q\rangle = \lambda(q) |q\rangle, \quad \text{where } |q\rangle = e^{\lambda(q) a_q^\dagger} |0\rangle.$$
For bosons, at least, the coherent state can thus be defined as
$$\Phi_\lambda = \Big( \prod_i e^{\lambda(q_i) a^\dagger_{q_i}} \Big) |0\rangle.$$
This is because bosonic operators commute.
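The eigenvalue property can be checked numerically for a single bosonic mode. This is a minimal sketch under simplifying assumptions: one discrete mode instead of a continuum of q, the usual normalized number basis $|n\rangle$ with $a|n\rangle = \sqrt{n}\,|n-1\rangle$, and a truncation at `n_max` quanta. The function names and the value of λ are made up for illustration.

```python
import math

def coherent_amplitudes(lam, n_max):
    """Coefficients c_n of exp(lam * a^dagger)|0> in the number basis:
    c_n = lam^n / sqrt(n!)."""
    return [lam**n / math.sqrt(math.factorial(n)) for n in range(n_max + 1)]

def annihilate(c):
    """Apply a: (a c)_n = sqrt(n + 1) * c_{n+1}. The output is one entry
    shorter because the top component is lost to the truncation."""
    return [math.sqrt(n + 1) * c[n + 1] for n in range(len(c) - 1)]

lam = 0.7 - 0.4j                      # hypothetical eigenvalue lambda
c = coherent_amplitudes(lam, 30)
ac = annihilate(c)

# a|lambda> should equal lambda|lambda> component by component.
max_err = max(abs(ac[n] - lam * c[n]) for n in range(len(ac)))
print(max_err)
```

The state is left unnormalized, as in the text; normalizing would multiply every $c_n$ by $e^{-|\lambda|^2/2}$ and would not affect the eigenvalue check.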
5 Quantum Fields and Antiparticles
This is a long chapter. Besides constructing fields, it also covers spin-statistics theorem, antiparticles, and CPT theorem. There is also a lot of interesting generality on how to construct fields of
larger representations of the Lorentz group.
5.1 General free fields
Question 61. What is the motivation for introducing fields?
Answer 61. We saw in the previous chapter that the foolproof way to construct a Hamiltonian
which satisfies cluster decomposition is to write it in terms of creation and annihilation operators
(with a single 3-momentum-conserving δ-function). We also saw in chapter 3 that we can guarantee
this Hamiltonian is Lorentz-invariant if the interaction satisfies
$$V(t) = \int d^3 \vec{x}\, U(\vec{x}, t), \quad \text{where } [U(x), U(x')] = 0 \text{ for } (x - x')^2 \ge 0.$$
Therefore, we should construct U out of creation and annihilation operators. It turns out that to
guarantee such a U is a good Lorentz scalar, we should build U out of fields, which are constructed
such that under a Lorentz transform, the transformation of the fields is position-independent:
$$U_0(\Lambda, a)\, \psi_l^\pm(x)\, U_0^{-1}(\Lambda, a) = \sum_{\bar{l}} D_{l\bar{l}}(\Lambda^{-1})\, \psi^\pm_{\bar{l}}(\Lambda x + a).$$
This is because adding or subtracting creation and annihilation operators, for example $a^\dagger_{\vec{p}_1} + a^\dagger_{\vec{p}_2}$,
doesn’t work right away since $a^\dagger_{\vec{p}_1}$ and $a^\dagger_{\vec{p}_2}$ have different transformation rules under a Lorentz
transform. We need to patch this problem up by tying $a^\dagger$ and $a$ to some coefficients $v_l$ and $u_l$
which make the combinations $u_l(x; \vec{p}\sigma n)\, a(\vec{p}\sigma n)$ and $v_l(x; \vec{p}\sigma n)\, a^\dagger(\vec{p}\sigma n)$ transform in the same way,
regardless of $\vec{p}\sigma n$. As you can imagine, the number of spin indices l tells us the spin of the particle.
The matrix Dll̄ tells us what representation we are working in. Although the representation
technically doesn’t have to be irreducible, it is easiest to work with irreps. (For example: massless
spin-1, massive spin-1/2, etc.)
Then, the interaction density U can always be written
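$$U(x) = \sum_{NM} \sum_{l_1' \cdots l_N'} \sum_{l_1 \cdots l_M} g_{l_1' \cdots l_N' l_1 \cdots l_M}\, \psi^-_{l_1'}(x) \cdots \psi^-_{l_N'}(x)\, \psi^+_{l_1}(x) \cdots \psi^+_{l_M}(x),$$
which is a scalar if the coefficients $g_{l_1' \cdots l_N' l_1 \cdots l_M}$ are Lorentz covariant with the representations of
the fields,
$$g_{\bar{l}_1' \cdots \bar{l}_N' \bar{l}_1 \cdots \bar{l}_M} = \sum_{l_1' \cdots l_N'} \sum_{l_1 \cdots l_M} D_{l_1' \bar{l}_1'}(\Lambda^{-1}) \cdots D_{l_N' \bar{l}_N'}(\Lambda^{-1})\, D_{l_1 \bar{l}_1}(\Lambda^{-1}) \cdots D_{l_M \bar{l}_M}(\Lambda^{-1})\, g_{l_1' \cdots l_N' l_1 \cdots l_M}.$$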
Basically, this means that under a Lorentz transform, the changes in ψ ± are absorbed into the
coupling constant g, so that the overall interaction U is invariant - a Lorentz scalar.
To summarize: “The cluster decomposition principle together with Lorentz invariance thus makes
it natural that the interaction density should be constructed out of the annihilation and creation
fields.”
Question 62. How can we construct a field?
Answer 62. The annihilation (+) and creation (−) fields are taken to be
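$$\psi_l^+(x) = \sum_{\sigma n} \int d^3 \vec{p}\; u_l(x; \vec{p}\sigma n)\, a(\vec{p}\sigma n) \quad \text{and} \quad \psi_l^-(x) = \sum_{\sigma n} \int d^3 \vec{p}\; v_l(x; \vec{p}\sigma n)\, a^\dagger(\vec{p}\sigma n).$$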
(It is a little weird to denote annihilation by (+).) From the previous chapter, we already know
the transformation rules of the creation and annihilation operators,
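$$U(\Lambda, a)\, a^\dagger_{p\sigma}\, U(\Lambda, a)^{-1} = e^{-i(\Lambda p) \cdot a} \sqrt{\frac{(\Lambda p)^0}{p^0}} \sum_{\sigma'} D^{(j)}_{\sigma'\sigma}(W(\Lambda, p))\, a^\dagger_{\Lambda p, \sigma'}.$$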
The idea is to tweak the transformation of ul and vl such that the fields ψ ± (x) transform in the
way we would like (see previous question).
The result (Weinberg pg 195-6) turns out to be uniquely expressible in terms of the standard
momentum, just like in Wigner little group. For massive particles (where the standard momentum
is ~k = 0), the result is
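$$u_l(x; \vec{p}\sigma n) = \frac{1}{(2\pi)^{3/2}}\, e^{ipx}\, u_l(\vec{p}\sigma n), \quad v_l(x; \vec{p}\sigma n) = \frac{1}{(2\pi)^{3/2}}\, e^{-ipx}\, v_l(\vec{p}\sigma n).$$
If p = L(p)k, where k is the standard momentum, we find
$$u_{\bar{l}}(\vec{p}\sigma n) = \sqrt{\frac{k^0}{p^0}} \sum_l D_{\bar{l} l}(L(p))\, u_l(\vec{k}\sigma n), \quad v_{\bar{l}}(\vec{p}\sigma n) = \sqrt{\frac{k^0}{p^0}} \sum_l D_{\bar{l} l}(L(p))\, v_l(\vec{k}\sigma n).$$
We recognize the factor $\sqrt{k^0/p^0}$ as the normalization $1/\sqrt{2E_{\vec{p}}}$ in Schwartz or Peskin and Schroeder.
Here, it is interpreted as a natural way to cancel the factor $\sqrt{(\Lambda p)^0/p^0}$ in the transform of the creation
or annihilation operator.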
Question 63. How does the above encode the Klein-Gordon equation?
Answer 63. Klein-Gordon equation can be interpreted as a consequence of translational invariance, from
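$$u_l(x; \vec{p}\sigma n) = \frac{1}{(2\pi)^{3/2}}\, e^{ipx}\, u_l(\vec{p}\sigma n), \quad v_l(x; \vec{p}\sigma n) = \frac{1}{(2\pi)^{3/2}}\, e^{-ipx}\, v_l(\vec{p}\sigma n).$$
Applying $\Box = \partial_\mu \partial^\mu$ to $\psi_l^\pm$ gives the Klein-Gordon equation,
$$(\Box - m^2)\, \psi_l^\pm(x) = 0.$$
This follows because $p_\mu p^\mu = -m^2$ in Weinberg's metric convention. Another way to think about it is that the Klein-Gordon equation
is simply a reflection of the fact that a spatial field is the Fourier transform of a momentum-space
field. For example,
$$f(\vec{x}) = \int_{\vec{p}} e^{i \vec{p} \cdot \vec{x}} f(\vec{p}) \implies -\nabla^2 f(\vec{x}) = \int_{\vec{p}} \vec{p}^{\,2}\, e^{i \vec{p} \cdot \vec{x}} f(\vec{p})$$
would satisfy a Klein-Gordon-esque equation if $\vec{p}^{\,2} = \text{const}$.
For a photon, $\Box \psi = 0$. This returns the $E = c|\vec{p}|$ relation.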
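A finite-difference check makes the sign conventions concrete. The sketch below assumes 1+1 dimensions, units with c = ħ = 1, and Weinberg's metric, so that $p \cdot x = px - Et$ and $\Box = \partial_x^2 - \partial_t^2$; the function names, step size and sample values are arbitrary choices for illustration.

```python
import cmath
import math

def plane_wave(t, x, p, m):
    """exp(i p.x) with p.x = p*x - E*t and E = sqrt(p^2 + m^2)."""
    E = math.sqrt(p * p + m * m)
    return cmath.exp(1j * (p * x - E * t))

def kg_residual(t, x, p, m, h=1e-3):
    """(Box - m^2) f at (t, x), with Box = d^2/dx^2 - d^2/dt^2 evaluated
    by central second differences of step h."""
    f = lambda tt, xx: plane_wave(tt, xx, p, m)
    d2x = (f(t, x + h) - 2 * f(t, x) + f(t, x - h)) / h**2
    d2t = (f(t + h, x) - 2 * f(t, x) + f(t - h, x)) / h**2
    return d2x - d2t - m * m * f(t, x)

# Hypothetical sample point and parameters.
massive = abs(kg_residual(t=0.3, x=-1.2, p=1.3, m=0.8))
massless = abs(kg_residual(t=0.3, x=-1.2, p=1.3, m=0.0))
print(massive, massless)
```

Both residuals vanish up to discretization error. The massless case is the statement $\Box\psi = 0$ with $E = |p|$.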
Question 64. As of now, why is the construction
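$$U(x) = \sum_{NM} \sum_{l_1' \cdots l_N'} \sum_{l_1 \cdots l_M} g_{l_1' \cdots l_N' l_1 \cdots l_M}\, \psi^-_{l_1'}(x) \cdots \psi^-_{l_N'}(x)\, \psi^+_{l_1}(x) \cdots \psi^+_{l_M}(x),$$
with correctly-transforming $g_{l_1' \cdots l_N' l_1 \cdots l_M}$ and $(u, v)_l(\vec{p}\sigma n)$, still not good enough?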
Answer 64. We haven’t considered the condition $[U(x), U(x')] = 0$ for $(x - x')^2 \ge 0$. This
is a problem because $[\psi_l^+(x), \psi_{l'}^-(y)]_\pm \ne 0$. (Here, ± means the [, ] symbol is interpreted as an
anticommutator for fermions. I use + to denote commutator and − to denote anticommutator,
but the convention in Weinberg is the other way.)
We will solve this problem by constructing the interaction out of the linear combinations $\psi_l(x) = \kappa_l \psi_l^+(x) + \lambda_l \psi_l^-(x)$, where the coefficients κ, λ are chosen such that the causality condition
$$[\psi_l(x), \psi_{l'}(y)]_\pm = [\psi_l(x), \psi_{l'}^\dagger(y)]_\pm = 0 \text{ for } (x - y)^2 \ge 0$$
is satisfied. This is a “causality” condition because it means that if x − y is spacelike, no signal
from x can affect measurements at y.
Question 65. How does conservation of charge imply the existence of antiparticles?
Answer 65. This is an interesting argument. If we believe that charge is conserved, we must have
[Q̂, U ] = 0.
If we construct U out of $\psi_l(x)$ fields (see previous question), we must also have $[\hat{Q}, \psi_l(x)] = -q_l \psi_l(x)$ for some $q_l$. Why? This means that each field carries a well-defined charge. It guarantees,
for example, that the state $|lm\rangle = \psi_l(x) \psi_m(y) |\Phi_0\rangle$ has a definite charge,
$$\hat{Q} |lm\rangle = \hat{Q}\, \psi_l(x) \psi_m(y) |\Phi_0\rangle = -(q_l + q_m)\, \psi_l(x) \psi_m(y) |\Phi_0\rangle.$$
In particular, both the creation and annihilation parts of ψl (x) must carry the same charge. If
particles of charge ql are created by the creation sector, then particles of charge −ql must be
annihilated in the other sector. This is the reason for antiparticles. We will use the label n̄ for
the antiparticle of n.
In particular, U is then “charge-neutral” if the sum of charges of the fields is zero. This is easy
to guarantee if we use Hermitian conjugates. For example,
U ⊃ ψ † ψ.
This is not always the case; for example, if a field is charge-neutral, then we need only one copy
of it and do not require a Hermitian conjugate. This is like the QED coupling or the Yukawa
coupling.
In retrospect, I feel this argument is not very strong because there were a lot of assumptions
hidden in the requirement that we should have a field ψl (x) = κl ψl+ (x) + λl ψl− (x) in the first place.
But Weinberg seems to have no problem with it.
5.2 Causal scalar fields
Now we get to our first example of a real, physical field. A scalar field is one that transforms with
the scalar representation of the Lorentz group, D(Λ) = 1. Because this is the trivial representation,
any scalar field must have spin zero. Adjusting the normalization factor $\sqrt{k^0/p^0}$ (i.e. through the
constants in $\psi_l(x) = \kappa_l \psi_l^+(x) + \lambda_l \psi_l^-(x)$) gives
$$\phi^+(x) = \int \frac{d^3 \vec{p}}{(2\pi)^{3/2} \sqrt{2p^0}}\, a_{\vec{p}}\, e^{ipx} \quad \text{and} \quad \phi^-(x) = (\phi^+(x))^\dagger.$$
Question 66. How can we enforce the causality condition and charge conservation with scalar
fields?
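Answer 66. Recall that the causality condition is
$$[\phi(x), \phi(y)]_\pm = [\phi(x), \phi^\dagger(y)]_\pm = 0 \quad\text{for } (x-y)^2 > 0.$$
The point is to choose the right coefficients in $\psi_l(x) = \kappa_l \psi_l^+(x) + \lambda_l \psi_l^-(x)$ such that this is true. We can compute
$$[\phi^+(x), \phi^-(y)]_\pm = \Delta_+(x-y) := \frac{1}{(2\pi)^3}\int \frac{d^3\vec p}{2p^0}\, e^{ip\cdot(x-y)} = \frac{m}{4\pi^2\sqrt{(x-y)^2}}\, K_1\!\left(m\sqrt{(x-y)^2}\right),$$
where the last form holds for spacelike separation. This is nonzero. However, it is even in $x^\mu - y^\mu$ for $(x-y)^2 > 0$. Therefore,
$$[\phi(x), \phi(y)]_\pm = \kappa\lambda(1 \mp 1)\Delta_+(x-y), \quad [\phi(x), \phi^\dagger(y)]_\pm = (|\kappa|^2 \mp |\lambda|^2)\Delta_+(x-y) \quad\text{for } (x-y)^2 > 0.$$
These can vanish only if the particle is a boson, and if $|\kappa| = |\lambda|$.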
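The sign bookkeeping in the two prefactors above is easy to get wrong, so here is a minimal sketch that just evaluates them. The function names are my own, and the convention that the upper sign is the bosonic commutator is taken from the formulas above.

```python
def causality_prefactors(kappa, lam, boson=True):
    """Prefactors of Delta_+ in [phi, phi]_± and [phi, phi^dagger]_±.

    The upper sign (a commutator) is used for bosons, the lower sign
    (an anticommutator) for fermions.
    """
    sign = 1 if boson else -1
    phi_phi = kappa * lam * (1 - sign)                   # kappa * lambda * (1 ∓ 1)
    phi_phidag = abs(kappa) ** 2 - sign * abs(lam) ** 2  # |kappa|^2 ∓ |lambda|^2
    return phi_phi, phi_phidag


def is_causal(kappa, lam, boson=True, tol=1e-12):
    """True when both prefactors vanish at spacelike separation."""
    return all(abs(c) < tol for c in causality_prefactors(kappa, lam, boson))


print(is_causal(1, 1, boson=True))    # bosonic statistics with |kappa| = |lambda|
print(is_causal(1, 1, boson=False))   # fermionic statistics for a scalar
print(is_causal(1, 2, boson=True))    # bosonic statistics with |kappa| != |lambda|
```

Only the first case passes, which is the statement that a causal scalar field must be bosonic with $|\kappa| = |\lambda|$; the relative phase is left free, as in the text.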
We can choose any relative phase between $\kappa$ and $\lambda$, but once we choose it for some $\phi(x)$, we must make it the same for every $\phi(x')$, where $x \neq x'$. We will choose the simplest convention $\kappa = \lambda = 1$.
Also, we need $\phi(x)$ to be a field of a single charge, so we will introduce new fields $\phi^{c+}$, $\phi^{c-}$ which have the opposite charge. The constructed field is therefore
$$\phi(x) = \phi^+(x) + \phi^{c-}(x) = \int \frac{d^3\vec p}{(2\pi)^{3/2}\sqrt{2p^0}}\left(a_{\vec p}\, e^{ipx} + a^{c\dagger}_{\vec p}\, e^{-ipx}\right).$$
Conclusion: scalar fields must be spinless but do not have to be charge-neutral.
The corresponding expression in Schwartz’ QFT is eq. (2.78). He implicitly assumed that a†
creates a particle of zero charge.
5.3 Causal vector fields
Besides the trivial rep, the next-simplest representation of the Lorentz group is the vector representation, in which
$$D(\Lambda)^\mu{}_\nu = \Lambda^\mu{}_\nu.$$
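We will consider only massive fields here, like $W^\pm$, $Z^0$. The annihilation and creation parts of the vector field are
$$\phi^{+\mu}(x) = \sum_\sigma \int \frac{d^3\vec p}{(2\pi)^{3/2}}\, u^\mu(\vec p\sigma)\, a_{\vec p\sigma}\, e^{ipx}, \quad\text{where } u^\mu(\vec p\sigma) = \sqrt{\frac{m}{p^0}}\, L(p)^\mu{}_\nu\, u^\nu(0,\sigma),$$
$$\phi^{-\mu}(x) = \sum_\sigma \int \frac{d^3\vec p}{(2\pi)^{3/2}}\, v^\mu(\vec p\sigma)\, a^\dagger_{\vec p\sigma}\, e^{-ipx}, \quad\text{where } v^\mu(\vec p\sigma) = \sqrt{\frac{m}{p^0}}\, L(p)^\mu{}_\nu\, v^\nu(0,\sigma).$$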
Question 67. Why must the particle represented by a vector field be spin-0 or spin-1?
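Answer 67. The explanation is in Weinberg pg. 207-8. It’s not particularly illuminating; basically he argues that the conditions (5.3.12-13)
$$\sum_{\bar\sigma} u^0(0,\bar\sigma)\big(\vec J^{(j)}\big)^2_{\bar\sigma\sigma} = 0, \qquad \sum_{\bar\sigma} u^i(0,\bar\sigma)\big(\vec J^{(j)}\big)^2_{\bar\sigma\sigma} = 2u^i(0,\sigma)$$
can only be satisfied in two cases: (1) $u^0$ is nonzero and the $u^i$ are all zero, i.e. spin-0; (2) $u^0$ is zero and the $u^i$ are nonzero, i.e. spin-1. Since $(\vec J^{(j)})^2 = j(j+1)$, the first condition forces $j = 0$ unless $u^0 = 0$, and the second forces $j(j+1) = 2$, i.e. $j = 1$, unless $u^i = 0$.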
Heuristically, how can we think about this? TODO
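One concrete way to see the two cases, as a sketch rather than Weinberg's argument: build the rotation generators that act on a 4-vector and compute $\vec J^2$. I am assuming the standard form $(J_k)_{ij} = -i\epsilon_{kij}$ on the spatial block and the component ordering $(t, x, y, z)$; all names are my own.

```python
import math


def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def rotation_generator(k):
    """4x4 generator J_k acting on (t, x, y, z); only the spatial block is nonzero."""
    gen = [[0j] * 4 for _ in range(4)]
    for i in range(3):
        for j in range(3):
            gen[i + 1][j + 1] = -1j * levi_civita(k, i, j)
    return gen


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)] for i in range(n)]


def casimir():
    """J^2 = J_1^2 + J_2^2 + J_3^2 in the vector representation."""
    total = [[0j] * 4 for _ in range(4)]
    for k in range(3):
        jk2 = matmul(rotation_generator(k), rotation_generator(k))
        total = [[total[i][j] + jk2[i][j] for j in range(4)] for i in range(4)]
    return total


def spin_from_casimir(value):
    """Solve j(j+1) = value for the non-negative root."""
    return (-1 + math.sqrt(1 + 4 * value)) / 2


J2 = casimir()
spins = [spin_from_casimir(J2[i][i].real) for i in range(4)]
print(spins)  # time component first, then the three spatial components
```

$\vec J^2$ comes out diagonal with entry 0 on the time component and 2 on the spatial components, so under rotations a 4-vector is a spin-0 piece plus a spin-1 piece and nothing else.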
Question 68. We showed in the previous question that the spin of a particle represented by a
vector field must be either 0 or 1. Describe the spin-0 case.
Answer 68. The spin-0 case is just the derivative of the scalar spin-0 field,
$$\phi^\mu(x) = \partial^\mu \phi(x).$$
It is therefore easy to write down its form in terms of creation and annihilation operators.
Question 69. Similar to the previous question, but describe the spin-1 case.
Answer 69. Remember that we are working with massive particles only. Also remember that for $j = 1$, only the spatial components $u^i$ and $v^i$ are nonzero. See Weinberg pg 209-210 for the complete derivation; the main point is that we can choose the field to be
$$\phi^{+\mu}(x) = \phi^{-\mu\dagger}(x) = \sum_\sigma \int \frac{d^3\vec p}{(2\pi)^{3/2}\sqrt{2p^0}}\, e^\mu(\vec p\sigma)\, a(\vec p\sigma)\, e^{ipx}.$$
Here, $e^\mu(\vec p\sigma) := L^\mu{}_\nu(\vec p)\, e^\nu(0\sigma)$ and we can choose (with components ordered $(x, y, z, t)$, so that the time component comes last and vanishes)
$$e^\mu(0,0) = (0,0,1,0)^T, \quad e^\mu(0,1) = -\frac{1}{\sqrt 2}(1,i,0,0)^T, \quad e^\mu(0,-1) = \frac{1}{\sqrt 2}(1,-i,0,0)^T.$$
These directions follow from acting with the raising and lowering operators.
Now, we are going to enforce the causality condition by tuning the parameters in $\phi^\mu(x) = \kappa\phi^{+\mu}(x) + \lambda\phi^{-\mu}(x)$. The argument is the same as it was for the scalar field.
Introducing the matrix $\Pi^{\mu\nu}(\vec p) := \sum_\sigma e^\mu(\vec p\sigma)\, e^{\nu*}(\vec p\sigma) = \eta^{\mu\nu} + p^\mu p^\nu/m^2$ gives
$$[\phi^{+\mu}(x), \phi^{-\nu}(y)]_\pm = \int \frac{d^3\vec p}{(2\pi)^3\, 2p^0}\, e^{ip\cdot(x-y)}\, \Pi^{\mu\nu}(\vec p) = \left(\eta^{\mu\nu} - \frac{\partial^\mu\partial^\nu}{m^2}\right)\Delta_+(x-y).$$
Again, this is even in $x^\mu - y^\mu$ for spacelike separations $(x-y)^2 > 0$. Therefore, the same construction is valid: the spin-one particles must be bosons and we must also have $|\kappa| = |\lambda|$.
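The spin sum $\Pi^{\mu\nu}$ can be checked directly at the reference momentum. This is a small sketch using the rest-frame polarization vectors quoted in Answer 69; I am assuming the component ordering $(x, y, z, t)$ with $\eta = \mathrm{diag}(1,1,1,-1)$ and $k = (0,0,0,m)$, and the names are my own.

```python
import math

S = 1 / math.sqrt(2)
# Rest-frame polarization vectors, components ordered (x, y, z, t).
POLARIZATIONS = {
    0: [0, 0, 1, 0],
    1: [-S, -1j * S, 0, 0],
    -1: [S, -1j * S, 0, 0],
}
ETA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]


def spin_sum(polarizations):
    """Pi^{mu nu} = sum over sigma of e^mu(sigma) * conj(e^nu(sigma))."""
    return [[sum(e[mu] * complex(e[nu]).conjugate() for e in polarizations.values())
             for nu in range(4)] for mu in range(4)]


def expected_pi(p, m):
    """eta^{mu nu} + p^mu p^nu / m^2."""
    return [[ETA[mu][nu] + p[mu] * p[nu] / m ** 2 for nu in range(4)] for mu in range(4)]


mass = 1.0
k_rest = [0, 0, 0, mass]
PI = spin_sum(POLARIZATIONS)
TARGET = expected_pi(k_rest, mass)
max_deviation = max(abs(PI[mu][nu] - TARGET[mu][nu]) for mu in range(4) for nu in range(4))
print(max_deviation)
```

Both sides reduce to the projector onto the three spatial directions: the $-1$ in $\eta^{tt}$ is cancelled by $k^t k^t/m^2 = 1$.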
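Again, we must conserve charge, so we introduce the antiparticle field $\phi^{c\mu\pm}$. The complete spin-1 vector field is
$$v^\mu(x) = \sum_\sigma \int \frac{d^3\vec p}{(2\pi)^{3/2}\sqrt{2p^0}}\left(e^\mu(\vec p\sigma)\, a_{\vec p\sigma}\, e^{ipx} + e^{\mu*}(\vec p\sigma)\, a^{c\dagger}_{\vec p\sigma}\, e^{-ipx}\right).$$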
Important point: there are three polarizations (i.e. σ = −1, 0, 1) but each polarization is a 4-vector.
And we are doing a 3-dimensional integral, even though the underlying manifold is 4-dimensional.
We should try not to confuse the 3 with the 4, and know what they point to each time we use
them.
Question 70. Describe why the photon field cannot be derived by taking a smooth limit of the
massive spin-1 vector field.
Answer 70. There are many reasons why, but Weinberg (pg. 212) gives one here: suppose the interaction density contains a current $J_\mu$ coupled to the massive vector field $v^\mu$,
$$U \supset J_\mu v^\mu.$$
The scattering rate is proportional to (let $\langle J_\mu\rangle$ be some generic matrix element, and sum over $\sigma$):
$$\Gamma \sim |\langle J_\mu\rangle e^\mu(\vec p\sigma)|^2 \sim \langle J_\mu\rangle\langle J_\nu\rangle^*\, \Pi^{\mu\nu}(\vec p),$$
and unfortunately the $p^\mu p^\nu/m^2$ term in $\Pi^{\mu\nu}$ blows up as $m \to 0$. The rate can remain finite only if the Ward identity
$$p_\mu \langle J^\mu\rangle = 0$$
holds.
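To see the blow-up numerically, here is a toy sketch that evaluates $\langle J_\mu\rangle\langle J_\nu\rangle^*\Pi^{\mu\nu}$ for two made-up constant currents, one with $p\cdot J = 0$ and one without. The currents, the momentum along $z$, and the ordering $(x, y, z, t)$ are all my assumptions for illustration.

```python
import math

ETA_DIAG = [1, 1, 1, -1]  # metric diagonal, components ordered (x, y, z, t)


def emission_rate(j_lower, pz, m):
    """J_mu J_nu^* Pi^{mu nu} with Pi = eta + p p / m^2 and p along z."""
    energy = math.sqrt(pz ** 2 + m ** 2)
    p_upper = [0.0, 0.0, pz, energy]
    j_dot_j = sum(ETA_DIAG[mu] * abs(j_lower[mu]) ** 2 for mu in range(4))
    p_dot_j = sum(p_upper[mu] * j_lower[mu] for mu in range(4))
    return j_dot_j + abs(p_dot_j) ** 2 / m ** 2


transverse = [1.0, 0.0, 0.0, 0.0]    # hypothetical current with p.J = 0
longitudinal = [0.0, 0.0, 1.0, 0.0]  # hypothetical current with p.J = pz != 0

for m in (1.0, 1e-2, 1e-4):
    print(m, emission_rate(transverse, 1.0, m), emission_rate(longitudinal, 1.0, m))
```

The conserved current gives a rate that does not depend on $m$ at all, while the non-conserved one grows like $1/m^2$.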
5.4 Spinor representation of Lorentz group
Question 71. Explain why the Dirac formalism is important in physics. What does this have to
do with the Clifford algebra?
Answer 71. We saw earlier that for an infinitesimal Lorentz transform $\Lambda^\mu{}_\nu = \delta^\mu{}_\nu + \omega^\mu{}_\nu$, where $\omega_{\mu\nu}$ is antisymmetric, the corresponding unitary acting on wavefunctions is
$$D(\Lambda) = 1 + \frac{i}{2}\omega_{\mu\nu}J^{\mu\nu},$$
and $J^{\mu\nu}$ satisfies the $SO(1,3)$ algebra.
It turns out that if there exist matrices satisfying the Clifford algebra $\{\gamma^\mu, \gamma^\nu\} = 2\eta^{\mu\nu}$, then we can construct generators
$$J^{\mu\nu} := -\frac{i}{4}[\gamma^\mu, \gamma^\nu]$$
which satisfy the $SO(1,3)$ algebra. This particular representation is called the spinor representation of the Lorentz algebra. This is important because any irrep of the Lorentz group is either a tensor, a spinor (as above), or a direct product of a tensor and spinor. More on this later.
Important: this spinor representation is not unitary because the boost generators are not Hermitian (they are anti-Hermitian). Ultimately, this leads us to use
$$\bar\psi\psi \ \text{ instead of } \ \psi^\dagger\psi$$
as the scalar quantity, for example, in a Lagrangian.
Question 72. Describe the difference between the vector and spinor representations.
Answer 72. Any representation of the Lorentz group has the same underlying transform on spacetime, $\Lambda^\mu{}_\nu$. However, the unitary $D(\Lambda)$ that acts on wavefunctions depends on the representation.
Scalar representation: $D(\Lambda) = 1$.
Vector representation: $D(\Lambda) = \Lambda$.
Spinor representation: $D(\Lambda) = e^{\frac{i}{2}\omega_{\mu\nu}J^{\mu\nu}}$.
D is a scalar in the scalar rep, but D is a 4×4 matrix in both the vector and spinor representations.
Interestingly, the γ-matrices themselves transform under the vector representation!
Conclusion: the vector and spinor representations of Lorentz group mentioned here tell us how
the wavefunctions transform under the Lorentz group.
Question 73. Explain why the γ-matrices can be interpreted as a 4-vector. How can we construct
tensors (i.e. generalizations of 4-vectors) from the γ-matrices?
Answer 73. There is an identity
$$[J^{\mu\nu}, \gamma^\rho] = -i\gamma^\mu\eta^{\nu\rho} + i\gamma^\nu\eta^{\mu\rho}$$
which combined with $D(\Lambda) = 1 + \frac{i}{2}\omega_{\mu\nu}J^{\mu\nu}$ implies that
$$D(\Lambda)\,\gamma^\rho\, D(\Lambda)^{-1} = \Lambda_\sigma{}^\rho\,\gamma^\sigma.$$
So, $\gamma^\rho$ transforms as a 4-vector under Lorentz transformation.
We have already constructed the antisymmetric tensor $J^{\mu\nu} = -\frac{i}{4}[\gamma^\mu, \gamma^\nu]$ by taking commutators of the γ-matrices. In fact, we can construct higher-order, totally-antisymmetric tensors by enlarging the commutator, which guarantees antisymmetry:
$$A^{\rho\sigma\tau} = \gamma^{[\rho}\gamma^\sigma\gamma^{\tau]}, \quad P^{\rho\sigma\tau\eta} = \gamma^{[\rho}\gamma^\sigma\gamma^\tau\gamma^{\eta]}.$$
You can check that these satisfy the tensor transformation rule, i.e.
$$D(\Lambda)\, J^{\rho\sigma}\, D^{-1}(\Lambda) = \Lambda_\mu{}^\rho\,\Lambda_\nu{}^\sigma\, J^{\mu\nu}.$$
Because we live in 1 + 3 = 4 dimensions, $P$ has the maximum number of indices of any antisymmetric tensor. A tensor with more indices, something like $Q^{02132}$, would necessarily have a repeated index and therefore vanish.
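The commutator identity at the start of this answer can be checked by brute force over all index values. This sketch assumes the explicit γ-matrices Weinberg gives on pg. 216 (they are quoted in Answer 74) and the metric $\eta = \mathrm{diag}(-1,1,1,1)$; the helper names are my own.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def lincomb(ca, a, cb, b):
    return [[ca * a[i][j] + cb * b[i][j] for j in range(4)] for i in range(4)]


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))


def block(tl, tr, bl, br, factor):
    """Assemble a 4x4 matrix from 2x2 blocks and multiply by an overall factor."""
    rows = [tl[i] + tr[i] for i in range(2)] + [bl[i] + br[i] for i in range(2)]
    return [[factor * x for x in row] for row in rows]


I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
NEG_SIGMA = [[[-x for x in row] for row in s] for s in SIGMA]

# gamma^0 and gamma^j in the assumed representation, upper indices.
GAMMA = [block(Z2, I2, I2, Z2, -1j)] + [
    block(Z2, s, ns, Z2, -1j) for s, ns in zip(SIGMA, NEG_SIGMA)
]
ETA = [-1, 1, 1, 1]  # diagonal of eta^{mu nu}, ordering (t, x, y, z)


def commutator(a, b):
    return lincomb(1, matmul(a, b), -1, matmul(b, a))


def generator(mu, nu):
    """J^{mu nu} = -(i/4) [gamma^mu, gamma^nu]."""
    return [[-0.25j * x for x in row] for row in commutator(GAMMA[mu], GAMMA[nu])]


def identity_holds():
    """Check [J^{mu nu}, gamma^rho] = -i gamma^mu eta^{nu rho} + i gamma^nu eta^{mu rho}."""
    for mu in range(4):
        for nu in range(4):
            for rho in range(4):
                lhs = commutator(generator(mu, nu), GAMMA[rho])
                eta_nu_rho = ETA[nu] if nu == rho else 0
                eta_mu_rho = ETA[mu] if mu == rho else 0
                rhs = lincomb(-1j * eta_nu_rho, GAMMA[mu], 1j * eta_mu_rho, GAMMA[nu])
                if not close(lhs, rhs):
                    return False
    return True


print(identity_holds())
```

The identity is really a consequence of the Clifford algebra alone, so any other representation of the γ-matrices should pass the same check.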
Question 74. Give convenient, explicit forms of the Dirac matrices and the corresponding generators of the Lorentz group in spinor representation.
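Answer 74. These are on pg. 216-17:
$$\gamma^0 = -i\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \gamma^j = -i\begin{pmatrix} 0 & \sigma_j \\ -\sigma_j & 0 \end{pmatrix}, \quad \gamma_5 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},$$
$$J^{i0} = \frac{i}{2}\begin{pmatrix} \sigma_i & 0 \\ 0 & -\sigma_i \end{pmatrix}, \quad J^{ij} = \frac{\epsilon_{ijk}}{2}\begin{pmatrix} \sigma_k & 0 \\ 0 & \sigma_k \end{pmatrix}.$$
The raising and lowering of indices proceeds as usual with the metric tensor,
$$\gamma_\mu = \eta_{\mu\nu}\gamma^\nu = (-\gamma^0, \vec\gamma) \quad\text{and}\quad \gamma_5 = \gamma^5.$$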
For a detailed explanation, see https://physics.stackexchange.com/questions/296772/covariant-gamm
I think that γµ = ηµν γ ν = (−γ 0 , ~γ ) may change depending on choice of metric, but γ5 = γ 5 is
independent of East Coast/West Coast convention.
While this spinor representation of the Lorentz group is reducible (i.e. into 2-component Weyl spinors), the matrices $\gamma_\mu$ are irreducible in the sense that there is no proper subspace left invariant by all the $\gamma_\mu$. In fact, we can show the Clifford algebra must be constructed with matrices which are 4 × 4 or larger. If the underlying space is in $d = 4$, there are $4^2 = 16$ independent matrices possible. We hope to construct all matrices out of the $\gamma_\mu$, so we take $\gamma_\mu$ to be 4 × 4.
In fact, this is borne out in the degrees of freedom (df) of the antisymmetric tensors $1, \gamma_\mu, J^{\mu\nu}, A^{\mu\nu\rho}, P^{\mu\nu\rho\eta}$. The df in these matrices are 1, 4, 6, 4, 1 respectively. These sum to 16. (You can think of all of these tensors as matrices of size 4 × 4.)
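As a sanity check on the explicit matrices, here is a sketch that verifies the Clifford algebra $\{\gamma^\mu, \gamma^\nu\} = 2\eta^{\mu\nu}$, the product formula $\gamma_5 = -i\gamma^0\gamma^1\gamma^2\gamma^3$, and one rotation generator. It assumes the representation and metric $\mathrm{diag}(-1,1,1,1)$ quoted in this answer; the helper names are my own.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def lincomb(ca, a, cb, b):
    return [[ca * a[i][j] + cb * b[i][j] for j in range(4)] for i in range(4)]


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))


def block(tl, tr, bl, br, factor):
    """Assemble a 4x4 matrix from 2x2 blocks and multiply by an overall factor."""
    rows = [tl[i] + tr[i] for i in range(2)] + [bl[i] + br[i] for i in range(2)]
    return [[factor * x for x in row] for row in rows]


I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
NEG_SIGMA = [[[-x for x in row] for row in s] for s in SIGMA]

GAMMA = [block(Z2, I2, I2, Z2, -1j)] + [
    block(Z2, s, ns, Z2, -1j) for s, ns in zip(SIGMA, NEG_SIGMA)
]
ETA = [-1, 1, 1, 1]  # diagonal of eta^{mu nu}, ordering (t, x, y, z)
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]


def anticommutator(a, b):
    return lincomb(1, matmul(a, b), 1, matmul(b, a))


# {gamma^mu, gamma^nu} = 2 eta^{mu nu} for every pair of indices.
clifford_ok = all(
    close(anticommutator(GAMMA[mu], GAMMA[nu]),
          [[2 * ETA[mu] * (mu == nu) * IDENTITY[i][j] for j in range(4)] for i in range(4)])
    for mu in range(4) for nu in range(4)
)

# gamma_5 = -i gamma^0 gamma^1 gamma^2 gamma^3 should equal diag(1, 1, -1, -1).
product = matmul(matmul(GAMMA[0], GAMMA[1]), matmul(GAMMA[2], GAMMA[3]))
GAMMA5 = [[-1j * x for x in row] for row in product]
gamma5_ok = close(GAMMA5, [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]])

# J^{12} = -(i/4)[gamma^1, gamma^2] should equal (1/2) diag(sigma_3, sigma_3).
comm12 = lincomb(1, matmul(GAMMA[1], GAMMA[2]), -1, matmul(GAMMA[2], GAMMA[1]))
J12 = [[-0.25j * x for x in row] for row in comm12]
j12_ok = close(J12, [[0.5, 0, 0, 0], [0, -0.5, 0, 0], [0, 0, 0.5, 0], [0, 0, 0, -0.5]])

print(clifford_ok, gamma5_ok, j12_ok)
```

The block-diagonal form of $J^{12}$ is the reducibility into Weyl spinors mentioned above, while no single block is preserved by the $\gamma^\mu$ themselves.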
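Question 75. Outline the following nice tricks with the γ-matrices and the tensors created from them: (1) the parity transformation with $\beta = i\gamma^0$ (2) restatement of tensors using the “pseudoscalar” $\gamma_5$ (3) certain transpose properties with $\mathcal C = \gamma_2\beta$.
Answer 75. It’s good to know these exist...
1. Let $\beta := i\gamma^0$. This gives a parity transform
$$\beta\gamma^0\beta^{-1} = \gamma^0, \quad \beta\vec\gamma\beta^{-1} = -\vec\gamma, \quad \beta\gamma_5\beta^{-1} = -\gamma_5, \quad \beta J^{i0}\beta^{-1} = -J^{i0}, \quad \beta J^{ij}\beta^{-1} = J^{ij}.$$
The above transform property of $\gamma_5$ (that $\gamma_5$ switches sign under parity) is why $\gamma_5$ is referred to as a pseudoscalar. Also useful are
$$\beta\gamma^{\mu\dagger}\beta^{-1} = -\gamma^\mu, \quad \beta J^{\rho\sigma\dagger}\beta^{-1} = J^{\rho\sigma}.$$
2. We can make the df of the tensors constructed with (enlarged) commutators of the γ-matrices very explicit by introducing the totally antisymmetric tensor $\epsilon^{\rho\sigma\tau\eta}$ (normalized so that $\epsilon^{0123} = +1$, as the expression for $P$ requires):
$$J^{\rho\sigma} = -\frac{1}{8}\epsilon^{\rho\sigma\tau\eta}\gamma_5[\gamma_\tau, \gamma_\eta], \quad A^{\rho\sigma\tau} = 3!\, i\,\epsilon^{\rho\sigma\tau\eta}\gamma_5\gamma_\eta, \quad P^{\rho\sigma\tau\eta} = 4!\, i\,\epsilon^{\rho\sigma\tau\eta}\gamma_5.$$
This works because $\gamma_5 = -i\gamma^0\gamma^1\gamma^2\gamma^3$, so the multiplication done in the commutators is already inside $\gamma_5$. (To prove these, use the anticommutation $\{\gamma^\mu, \gamma^\nu\} = 2\eta^{\mu\nu}$.) Neat!
3. Finally, let $\mathcal C := \gamma_2\beta$. This will be related to charge-conjugation of a Dirac spinor. We have
$$\mathcal C\gamma_\mu\mathcal C^{-1} = -\gamma_\mu^T, \quad \mathcal C\gamma_5\mathcal C^{-1} = \gamma_5^T, \quad \mathcal C(\gamma_5\gamma_\mu)\mathcal C^{-1} = (\gamma_5\gamma_\mu)^T, \quad \mathcal C J_{\mu\nu}\mathcal C^{-1} = -J_{\mu\nu}^T.$$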
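Signs in these conjugation rules are easy to lose, so here is a sketch that checks a few of them numerically: that $\gamma_5$ and $\vec\gamma$ flip sign under $\beta$, and how $\gamma_\mu$ and $\gamma_5\gamma_\mu$ behave under $\mathcal C$. It assumes the explicit representation from Answer 74 and uses $\beta^{-1} = \beta$ and $\mathcal C^{-1} = -\mathcal C$, which hold in that representation and are verified in the code; names are my own.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))


def scaled(c, a):
    return [[c * x for x in row] for row in a]


def transpose(a):
    return [[a[j][i] for j in range(4)] for i in range(4)]


def block(tl, tr, bl, br, factor):
    rows = [tl[i] + tr[i] for i in range(2)] + [bl[i] + br[i] for i in range(2)]
    return [[factor * x for x in row] for row in rows]


I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
NEG_SIGMA = [[[-x for x in row] for row in s] for s in SIGMA]
GAMMA = [block(Z2, I2, I2, Z2, -1j)] + [
    block(Z2, s, ns, Z2, -1j) for s, ns in zip(SIGMA, NEG_SIGMA)
]
ETA = [-1, 1, 1, 1]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

GAMMA_LOWER = [scaled(ETA[mu], GAMMA[mu]) for mu in range(4)]  # gamma_mu
GAMMA5 = scaled(-1j, matmul(matmul(GAMMA[0], GAMMA[1]), matmul(GAMMA[2], GAMMA[3])))
BETA = scaled(1j, GAMMA[0])          # beta = i gamma^0
C = matmul(GAMMA_LOWER[2], BETA)     # C = gamma_2 beta
C_INV = scaled(-1, C)                # valid because C^2 = -1 here


def conjugate(s, s_inv, a):
    """Return s a s^{-1}."""
    return matmul(matmul(s, a), s_inv)


checks = {
    "beta is its own inverse": close(matmul(BETA, BETA), IDENTITY),
    "C inverse is -C": close(matmul(C, C_INV), IDENTITY),
    "gamma_5 flips under parity": close(conjugate(BETA, BETA, GAMMA5), scaled(-1, GAMMA5)),
    "spatial gammas flip under parity": all(
        close(conjugate(BETA, BETA, GAMMA[j]), scaled(-1, GAMMA[j])) for j in (1, 2, 3)),
    "C gamma_mu C^-1 = -gamma_mu^T": all(
        close(conjugate(C, C_INV, g), scaled(-1, transpose(g))) for g in GAMMA_LOWER),
    "C gamma_5 gamma_mu C^-1 = +(gamma_5 gamma_mu)^T": all(
        close(conjugate(C, C_INV, matmul(GAMMA5, g)), transpose(matmul(GAMMA5, g)))
        for g in GAMMA_LOWER),
}
for name, ok in checks.items():
    print(name, ok)
```

The last check follows from the two before it: $\mathcal C\gamma_5\gamma_\mu\mathcal C^{-1} = \gamma_5^T(-\gamma_\mu^T)$, and moving $\gamma_5^T$ past $\gamma_\mu^T$ costs one more sign.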
5.5 Causal Dirac field
Phew! The last section was a hard one. Now we will construct a wavefunction $\psi(x)$, called a Dirac spinor, that transforms under the spinor representation $D(\Lambda) = e^{\frac{i}{2}\omega_{\mu\nu}J^{\mu\nu}}$ of the Lorentz group. We will start with the usual plane-wave expansion
$$\psi_l^+(x) = \sum_\sigma \int \frac{d^3\vec p}{(2\pi)^{3/2}}\, u_l(\vec p\sigma)\, a_{\vec p\sigma}\, e^{ipx}, \quad \psi_l^{c-}(x) = \sum_\sigma \int \frac{d^3\vec p}{(2\pi)^{3/2}}\, v_l(\vec p\sigma)\, a^{c\dagger}_{\vec p\sigma}\, e^{-ipx}.$$
The regular normalization $1/\sqrt{2p^0}$ is inside the coefficient $(u,v)_l(\vec p\sigma)$. We will calculate them after we choose the spinors $(u,v)_l(\vec k\sigma)$ at the reference momentum $k = (m, \vec 0)$.
Question 76. Outline how we can choose the spinors $(u,v)_l(\vec k\sigma)$ at the reference momentum $k = (m, \vec 0)$. This will give us the spinors at general momenta, $(u,v)_l(\vec p\sigma)$.
Answer 76. The derivation is difficult and subtle; it’s on page 220-22. Rather than rehashing the derivation, I will try to illuminate what the spinor representation of Lorentz group has to do with the choice of reference spinors, $(u,v)_l(\vec k\sigma)$.
We start with the transformation of the coefficients under the Lorentz group, which is supposed to mesh with the transformations of the creation and annihilation operators (writing $\vec{\mathcal J}$ for the rotational part of $J^{\mu\nu}$):
$$\sum_{\bar\sigma} u_{\bar l}(0\bar\sigma)\, \vec J^{(j)}_{\bar\sigma\sigma} = \sum_l \vec{\mathcal J}_{\bar l l}\, u_l(0\sigma).$$
The main point is that we found $\vec{\mathcal J}_{\bar l l}$ to be block-diagonal (see the previous section), which leads to the idea of Weyl spinors. Enforcing the transformation condition on the spinors at the reference momentum gives
$$u(0,\sigma) = \begin{pmatrix} u_+(0,\sigma) \\ u_-(0,\sigma) \end{pmatrix}, \quad\text{where } u_\pm(0,\sigma) = \begin{pmatrix} u_{\frac12\pm}(0,\sigma) \\ u_{-\frac12\pm}(0,\sigma) \end{pmatrix}$$
is the Weyl spinor. Here,
• ± refers to the irrep of the Lorentz group (i.e. LH or RH Weyl spinor) which is connected to the block-diagonal form of the $J^{\mu\nu}$ generator we saw earlier.
• $m$ in $u_{m\pm}(0,\sigma)$ refers to the spin of the Weyl spinor under the Wigner rotation: its eigenvalue with respect to $\vec J^{(j)}_{\bar\sigma\sigma}$. Ultimately, this came from the little-group $D^{(j)}_{\bar\sigma\sigma}(W(\Lambda,p))$.
• $\sigma$ refers to the spin of this Dirac spinor in the Dirac representation: its eigenvalue with respect to the rotational part of $J^{\mu\nu}$.
As described on pg. 220, we pick the convention that $\vec J^{(j)}$ is aligned with the rotational part of $J^{\mu\nu}$. Once we pick this convention, it means that, for example (for some undetermined constants $c_\pm$):
$$u(0,\tfrac12) = \begin{pmatrix} c_+ \\ 0 \\ c_- \\ 0 \end{pmatrix}$$
because the Weyl spinors must carry a $\delta_{m\sigma}$. Otherwise, part of the Weyl spinor would point in the wrong direction and $u(0,\frac12)$ would no longer be an eigenvector of the rotational part of $J^{\mu\nu}$.
The spinors at general momenta are obtained as usual,
$$u(\vec p\sigma) = \sqrt{\frac{m}{p^0}}\, D(L(p))\, u(\vec 0\sigma), \quad v(\vec p\sigma) = \sqrt{\frac{m}{p^0}}\, D(L(p))\, v(\vec 0\sigma).$$
Conclusion: the spinor representation of Lorentz group pops up in two places in our derivation of $(u,v)(\vec p\sigma)$: (1) determination of $(u,v)(\vec 0\sigma)$ (2) transform to finite momenta, through $D(L(p))$.
Question 77. Describe how the spinors (i.e. coefficient functions) in the previous question satisfy
the Dirac equation. Is this real physics or just a convention?
Answer 77. If you look at the spinors on the top of pg. 221, for example
$$u(0,\tfrac12) = \begin{pmatrix} c_+ \\ 0 \\ c_- \\ 0 \end{pmatrix},$$
it’s clear that $u(0,\frac12)$ is not generally an eigenvector of $p_\mu\gamma^\mu$. The thing that makes the coefficient functions eigenvectors of the operator $p_\mu\gamma^\mu$ is the convention chosen in (5.5.33).
In fact, I think the idea that the Dirac equation is a result of convention is generally true, regardless of how you derive it (even as a “square root” of the Klein-Gordon equation). For example, see https://quantummechanics.ucsd.edu/ph130a/130_notes/node45.html. In these notes, there is no arbitrariness up until $\phi^{(L)}$ and $\phi^{(R)}$ are introduced. Another way to think of this is that scalars only have two square roots (i.e. $(\pm 1)^2 = 1$), but matrices have many square roots. Therefore, there is no unique “square root” of the Klein-Gordon equation, which is really a matrix equation.
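To make the role of the convention concrete, here is a sketch of the rest-frame Dirac operator acting on $u(0,\frac12) = (c_+, 0, c_-, 0)^T$. I am assuming Weinberg's form $(i\gamma^\mu p_\mu + m)u = 0$ with metric $(-+++)$, which at $p = (m, \vec 0)$ reduces to $m(1 - \beta)u = 0$ with $\beta = i\gamma^0$ in the representation of Answer 74; the function name is my own.

```python
# beta = i gamma^0 in the representation of Answer 74: it swaps the two Weyl blocks.
BETA = [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]]


def rest_frame_residual(c_plus, c_minus, m=1.0):
    """Components of m (1 - beta) u for u = (c_plus, 0, c_minus, 0)."""
    u = [c_plus, 0, c_minus, 0]
    beta_u = [sum(BETA[i][j] * u[j] for j in range(4)) for i in range(4)]
    return [m * (u[i] - beta_u[i]) for i in range(4)]


print(rest_frame_residual(1, 1))  # c_+ = c_-
print(rest_frame_residual(1, 2))  # c_+ != c_-
```

The residual vanishes only when $c_+ = c_-$, so under these assumptions the Dirac equation for $u$ is equivalent to that choice of relative normalization between the two Weyl blocks.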
Question 78. Outline the construction of the Dirac field, given that we already constructed the
momentum-dependent spinors, (u, v)(~pσ).
Answer 78. Again, we would like to find the coefficients in the linear combination
$$\psi_l(x) = \kappa\psi_l^+(x) + \lambda\psi_l^{c-}(x).$$
Here, $l$ indexes each entry in a Dirac spinor. We enforce causality by requiring that $[\psi_l(x), \psi_{\bar l}(y)]_\pm = [\psi_l(x), \psi^\dagger_{\bar l}(y)]_\pm = 0$ for $x - y$ spacelike. It turns out that the causality condition implies anticommutations and hence fermionic statistics.
Question 79. Explain why the vector field ended up being bosonic and the Dirac field ended up being fermionic. Where did the difference come from?
Answer 79. Both the vector and Dirac fields are 4-component fields. However, compare the commutations for the vector field (pg 210-211)
$$[v^\mu(x), v^{\nu\dagger}(y)]_\pm = (|\kappa|^2 \mp |\lambda|^2)\left(\eta^{\mu\nu} - \frac{1}{m^2}\partial^\mu\partial^\nu\right)\Delta_+(x-y)$$
with those for the Dirac field (pg 223)
$$[\psi_l(x), \psi^\dagger_{\bar l}(y)]_\pm = \Big(|\kappa|^2(-\gamma^\mu\partial_\mu + b_u m)\beta\,\Delta_+(x-y) \mp |\lambda|^2(-\gamma^\mu\partial_\mu + b_v m)\beta\,\Delta_+(y-x)\Big)_{l\bar l}.$$
We see that there are ∓ signs in both of them. However, acting on $\Delta_+$ (which is even at spacelike separation), the second derivative $\partial^\mu\partial^\nu$ gives an even function while the first derivative $\partial_\mu$ gives an odd function. That is where the difference comes from, and we need to switch a sign for the Dirac field.
Now the question is, why does one of them have a covariant $\eta^{\mu\nu} - \frac{1}{m^2}\partial^\mu\partial^\nu$ operator and the other have a Dirac operator $-\gamma^\mu\partial_\mu + m$? The answer is that both of these are “spin sums,” so to speak, which arise from the (anti)commutations of the total fields:
$$\Pi^{\mu\nu}(\vec p) := \sum_\sigma e^\mu(\vec p\sigma)\, e^{\nu*}(\vec p\sigma) = \eta^{\mu\nu} + \frac{1}{m^2}p^\mu p^\nu,$$
$$N_{l\bar l}(\vec p) := \sum_\sigma u_l(\vec p\sigma)\, u^*_{\bar l}(\vec p\sigma) = \frac{1}{2p^0}(-ip^\mu\gamma_\mu + m)\beta.$$
Conclusion: It seems we can determine whether a field is fermionic or bosonic merely by looking at the numerator of the propagator (which is exactly the spin sum; see Schwartz pg. 460). If the spin sum is even, we have a boson; if it is odd, we have a fermion.
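The parity argument only uses the fact that differentiating an even function once gives an odd function and twice gives an even one. Here is a one-dimensional toy check with finite differences; the Gaussian is just a stand-in for $\Delta_+$ at spacelike separation, not the real function, and the names are my own.

```python
import math


def even_profile(x):
    """Toy even function standing in for Delta_+ at spacelike separation."""
    return math.exp(-x * x)


def first_derivative(f, x, h=1e-4):
    return (f(x + h) - f(x - h)) / (2 * h)


def second_derivative(f, x, h=1e-4):
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)


x0 = 0.7
# For an odd function f'(x0) + f'(-x0) vanishes; for an even one f''(x0) - f''(-x0) vanishes.
odd_check = first_derivative(even_profile, x0) + first_derivative(even_profile, -x0)
even_check = second_derivative(even_profile, x0) - second_derivative(even_profile, -x0)
print(odd_check, even_check)
```

So a term with one derivative changes sign under $x - y \to y - x$ while a term with two derivatives does not, which is why the relative sign between the $|\kappa|^2$ and $|\lambda|^2$ pieces differs between the Dirac and vector cases.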
5.6 General irreps of the Lorentz group
We now study general irreps of the Lorentz group, of which the scalar, vector and Dirac representations were examples. All fields can be constructed as direct sums of fields which are constructed from irreps.
In fact, this is not that hard. Weinberg’s rigorous derivation is equivalent to what Girma Hailu taught me in Physics 251b, i.e. that (at the level of the complexified Lie algebra) the Lorentz group is
$$SO^+(1,3) \sim \frac{SU(2)\times SU(2)}{\mathbb Z_2}.$$
Question 80. How can we label (i.e. classify) general irreps of the proper orthochronous Lorentz
group, SO+ (1, 3)?
Answer 80. The Lorentz algebra is
$$[J_{\mu\nu}, J_{\rho\sigma}] = i(J_{\rho\nu}\eta_{\sigma\mu} + J_{\mu\rho}\eta_{\nu\sigma} - J_{\sigma\nu}\eta_{\rho\mu} - J_{\mu\sigma}\eta_{\nu\rho}),$$
and the point is to find matrices $J_{\mu\nu}$ that satisfy this algebra. As we saw, it is not just the size of the matrix that determines the representation; for example, both the vector and Dirac-spinor representations are constructed from generators which are 4 × 4 matrices.
To classify general representations, we will split the six independent components of $J_{\mu\nu}$ (the diagonal entries are zero because $J_{\mu\nu}$ is antisymmetric under $\mu \leftrightarrow \nu$) into two 3-vectors:
Angular momentum 3-vector: $\vec J = (J_{23}, J_{31}, J_{12})$,
Boost 3-vector: $\vec H = (J_{10}, J_{20}, J_{30})$.
There is a problem; namely, that $[J_i, H_j] \neq 0$. We can fix this by constructing the linear combinations
$$\vec A = \frac12(\vec J + i\vec H) \quad\text{and}\quad \vec B = \frac12(\vec J - i\vec H).$$
This decouples the 3-vectors,
$$[A_i, A_j] = i\epsilon_{ijk}A_k, \quad [B_i, B_j] = i\epsilon_{ijk}B_k, \quad [A_i, B_j] = 0.$$
Great! The combinations $\vec A$ and $\vec B$ each satisfy an $SU(2)$ algebra.
Therefore, the representation is labeled by $(A, B)$, where $A$ describes the representation of the first $SU(2)$ and $B$ describes the representation of the second $SU(2)$. Both $A$ and $B$ are either integers or half-integers.
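The decoupling can be seen explicitly in the Dirac representation. This sketch takes the rotation and boost generators in the block form quoted in Answer 74, with $H_i$ identified with $J^{i0}$ (an assumption about index placement; flipping the sign of $\vec H$ just swaps the roles of $\vec A$ and $\vec B$), and checks the three commutation relations. Helper names are my own.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def combine(ca, a, cb, b):
    return [[ca * a[i][j] + cb * b[i][j] for j in range(4)] for i in range(4)]


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))


def commutator(a, b):
    return combine(1, matmul(a, b), -1, matmul(b, a))


def scaled2(c, s):
    return [[c * x for x in row] for row in s]


def block_diag(upper, lower):
    return [upper[0] + [0, 0], upper[1] + [0, 0], [0, 0] + lower[0], [0, 0] + lower[1]]


def eps(i, j, k):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
J = [block_diag(scaled2(0.5, s), scaled2(0.5, s)) for s in SIGMA]    # rotations
H = [block_diag(scaled2(0.5j, s), scaled2(-0.5j, s)) for s in SIGMA]  # boosts
A = [combine(0.5, J[i], 0.5j, H[i]) for i in range(3)]
B = [combine(0.5, J[i], -0.5j, H[i]) for i in range(3)]


def satisfies_su2(gens):
    """Check [G_i, G_j] = i eps_ijk G_k."""
    for i in range(3):
        for j in range(3):
            rhs = [[sum(1j * eps(i, j, k) * gens[k][r][c] for k in range(3))
                    for c in range(4)] for r in range(4)]
            if not close(commutator(gens[i], gens[j]), rhs):
                return False
    return True


def mutually_commute(first, second):
    zero = [[0] * 4 for _ in range(4)]
    return all(close(commutator(x, y), zero) for x in first for y in second)


print(satisfies_su2(A), satisfies_su2(B), mutually_commute(A, B), mutually_commute(J, H))
```

In this representation $\vec A$ lives entirely in one 2 × 2 block and $\vec B$ in the other, which is the $(\frac12, 0) \oplus (0, \frac12)$ structure of the Dirac spinor.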
Question 81. Discuss which generators are Hermitian and which are not, and hence which representations are unitary and which are not. Does this matter?
Answer 81. $\vec J$, $\vec A$, $\vec B$ are Hermitian, and $\vec H$ is anti-Hermitian. We discovered this, for example, at the beginning of section 5.4. The fact that $\vec H$ is anti-Hermitian is why we had to multiply it by $i$ when constructing $\vec A$ and $\vec B$.
Why is $\vec H$ anti-Hermitian? This is because there is a relative (−1) in the metric, (− + ++), and a boost operator constructed with the boost generators must preserve the invariant $-dt^2 + d\vec x^2$. If the metric were Euclidean, (+ + ++), then $\vec H$ would be Hermitian.
Does this matter? Weinberg pg. 231: “There is no problem in working with non-unitary representations, because the objects we are now concerned with are fields, not wavefunctions, and do not need to have a Lorentz-invariant positive norm.” This is really confusing because the Lorentz algebra was derived on pg. 60 from unitary representations, $U(\Lambda, a)\,U(1+\omega, \epsilon)\,U^{-1}(\Lambda, a)$. And besides, eq. (2.4.4) literally says that $J_{\mu\nu}$ is Hermitian. It turns out that this only applies for
the transformations of wavefunctions, i.e. creation and annihilation operators. Because this is an
example of a valid representation, we can go backwards and find out what the algebra is. Because
the algebra is the same for every representation, there is no logical fallacy, even if the logic is a
bit convoluted.
Also, Schwartz makes a big deal of this. Subsection (8.1.1) in his book is literally called “Unitarity
vs. Lorentz invariance,” and he says that we would like our representations to be unitary; because
there are no finite-dimensional unitary representations of the Lorentz group, we have to go to
infinite dimensions: hence, the Wigner little group. This makes sense because we had to introduce
Wigner little group when we studied relativistic QM in Weinberg Ch 2, for which we demanded
the representation be unitary.
What is the resolution? I think the resolution is on Schwartz pg. 184, “the next step towards quantizing a theory with spinors is to use these Lorentz group representations to generate irreducible unitary representations of the Poincaré group.” On pg. 188, he says that the coefficient functions $u(\vec p\sigma)$ and $v(\vec p\sigma)$ transform in a unitary way under the Poincaré group with infinite-dimensional representation.
Weinberg’s math says the same thing. According to Weinberg (5.1.6),
$$U_0(\Lambda, b)\,\psi_l^+(x)\,U_0^{-1}(\Lambda, b) = \sum_{\bar l} D_{l\bar l}(\Lambda^{-1})\,\psi^+_{\bar l}(\Lambda x + b).$$
This turns out to give
$$U_0(\Lambda, b)\, a_{\vec p\sigma}\, U_0^{-1}(\Lambda, b) = e^{i(\Lambda p)\cdot b}\sqrt{\frac{(\Lambda p)^0}{p^0}}\sum_{\bar\sigma} D^{(j_n)}_{\sigma\bar\sigma}(W^{-1}(\Lambda, p))\, a_{\vec p_\Lambda\bar\sigma}.$$
The factor $\sqrt{(\Lambda p)^0/p^0}$ is a reflection of the non-unitarity of boosts, in the sense that its magnitude is not 1; we had to insert it in order to preserve unitarity (i.e. preserve the norm of wavefunctions). $e^{i(\Lambda p)\cdot b}$ is fine because its magnitude is 1, and the Wigner rotation $\sum_{\bar\sigma} D^{(j_n)}_{\sigma\bar\sigma}(W^{-1}(\Lambda, p))$ is also fine because rotations are unitary.
Conclusion: The Lorentz-group representation of $\psi(x)$, or a field in general, need not be unitary. To quote from earlier in the notes, “we should construct U out of creation and annihilation operators. It turns out that to guarantee such a U is a good Lorentz scalar, we should build U out of fields, which are constructed such that under Lorentz transform, the transformation of the fields are position-independent:
$$U_0(\Lambda, a)\,\psi_l^\pm(x)\,U_0^{-1}(\Lambda, a) = \sum_{\bar l} D_{l\bar l}(\Lambda^{-1})\,\psi^\pm_{\bar l}(\Lambda x + a).$$”
If $U$ is a Lorentz scalar, we’re good; it is irrelevant whether $D_{l\bar l}(\Lambda^{-1})$ is unitary. That is why, for example, we use $\bar\psi(x)\psi(x)$ instead of $\psi^\dagger(x)\psi(x)$.
As for the transformations of $a_{\vec p\sigma}$, in fact the transform $U_0(\Lambda, b)\, a_{\vec p\sigma}\, U_0^{-1}(\Lambda, b)$ listed above is unitary but was actually fixed before the idea of field was even introduced. The point is that we need to insert these into a field to satisfy cluster decomposition, and if we want to describe, for example, spin-3/2 particles, we need a field that has enough spin indices to accommodate the desired creation and annihilation operators.
Question 82. Why do we care about $(\frac12, 0) \oplus (0, \frac12)$ as a whole? Why not just reduce it to $(\frac12, 0)$?
Answer 82. While it is true that $(\frac12, 0) \oplus (0, \frac12)$ is a reducible representation of the proper orthochronous Lorentz group, it is not a reducible representation of the entire Lorentz group including space inversion. This is because the parity transform $\beta$ gives
$$\beta\vec A\beta^{-1} = \vec B, \quad \beta\vec B\beta^{-1} = \vec A.$$
An irreducible $(A, B)$ representation of the proper orthochronous Lorentz group does not provide a representation of the entire Lorentz group unless $A = B$. For $A \neq B$, the solution is to take $(A, B) \oplus (B, A)$.
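The swap of $\vec A$ and $\vec B$ under parity can be checked in the Dirac representation. This sketch assumes the block forms of the rotation and boost generators from Answer 74, identifies $H_i$ with $J^{i0}$, and uses $\beta = i\gamma^0$, which in that representation exchanges the two 2 × 2 blocks and is its own inverse. Names are my own.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def combine(ca, a, cb, b):
    return [[ca * a[i][j] + cb * b[i][j] for j in range(4)] for i in range(4)]


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))


def scaled2(c, s):
    return [[c * x for x in row] for row in s]


def block_diag(upper, lower):
    return [upper[0] + [0, 0], upper[1] + [0, 0], [0, 0] + lower[0], [0, 0] + lower[1]]


SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
J = [block_diag(scaled2(0.5, s), scaled2(0.5, s)) for s in SIGMA]    # rotations
H = [block_diag(scaled2(0.5j, s), scaled2(-0.5j, s)) for s in SIGMA]  # boosts
A = [combine(0.5, J[i], 0.5j, H[i]) for i in range(3)]
B = [combine(0.5, J[i], -0.5j, H[i]) for i in range(3)]

BETA = [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]]  # beta = beta^{-1}


def parity(matrix):
    """Return beta M beta^{-1}."""
    return matmul(matmul(BETA, matrix), BETA)


rotations_invariant = all(close(parity(J[i]), J[i]) for i in range(3))
boosts_flip = all(close(parity(H[i]), scaled_h) for i, scaled_h in
                  enumerate([[[-x for x in row] for row in h] for h in H]))
a_b_swap = all(close(parity(A[i]), B[i]) and close(parity(B[i]), A[i]) for i in range(3))
print(rotations_invariant, boosts_flip, a_b_swap)
```

Since parity leaves $\vec J$ alone and flips $\vec H$, it has to exchange $\frac12(\vec J + i\vec H)$ with $\frac12(\vec J - i\vec H)$, so neither Weyl block is closed under it.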
5.7 CPT theorem; spin-statistics theorem
5.8 Massless fields
Question 83. Which representations of the Lorentz group are easily adapted to massless fields?
Which are more difficult?
Answer 83. The scalar and Dirac fields for $m > 0$ can be smoothly taken to $m \to 0$ without issue. The vector field cannot. This is because the vector field has some $m^{-1}$ factors which blow up (i.e. in the spin sum) but the Dirac field does not. Another way to think about it is that in the $m \to 0$ vector representation, at least one of the polarization vectors blows up.
Generally, it turns out that creation and annihilation operators for massless particles of spin j ≥ 1
cannot be used to construct all of the irreps (A, B) which can be constructed for massive particles.
This leads to gauge invariance.
Question 84. Mathematically, why can’t some irreps of the Lorentz group be encoded in massless
fields?
Answer 84. Suppose the field lives in the $(A, B)$ irrep of the Lorentz group; the eigenvalues of the third components of $\vec A$ and $\vec B$ are $a$ and $b$, and we label the corresponding coefficient function by $u_{ab}(\vec k\sigma)$. For massive particles, $-A \leq a \leq A$, but for massless particles $a$ and $b$ are pinned to extreme values, $a = -A$ and $b = +B$. (Well, we will discover this below.) We showed previously that the irreps of the Lorentz group are basically matrices $D(\Lambda)_{\bar l l}$ of various size which satisfy the right algebra. For example, $(J_{ij})_{a'b',ab} = \epsilon_{ijk}\big[(J_k^{(A)})_{a'a}\delta_{b'b} + (J_k^{(B)})_{b'b}\delta_{a'a}\big]$.
The $ISO(2)$ little-group transformation implies that for the rotation part parametrized by $\theta$,
$$\sigma\, u_{ab}(\vec k\sigma) = (a + b)\, u_{ab}(\vec k\sigma), \quad -\sigma\, v_{ab}(\vec k\sigma) = (a + b)\, v_{ab}(\vec k\sigma).$$
For the translational part parametrized by $\alpha, \beta$,
$$\big(J_1^{(A)} - iJ_2^{(A)}\big)_{aa'}\, u_{a'b}(\vec k\sigma) = 0, \quad \big(J_1^{(B)} + iJ_2^{(B)}\big)_{bb'}\, u_{ab'}(\vec k\sigma) = 0.$$
The above line means that the massless state is in lowest angular-momentum state for $A$ and highest angular-momentum state for $B$. Put together, this implies
$$a = -A, \quad b = B, \quad \sigma = B - A.$$
This field annihilates massless particles of helicity $\sigma$ and creates massless antiparticles of helicity $-\sigma$. Because the vector representation is $(\frac12, \frac12)$, it can only describe helicity zero. The simplest covariant massless field with helicity one is therefore $(1, 0) \oplus (0, 1)$, which is the $F_{\mu\nu}$ tensor.
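The rule $\sigma = B - A$ is simple enough to tabulate. Here is a small sketch that does so for the representations mentioned in these notes; the function name and the choice of examples are my own.

```python
from fractions import Fraction


def massless_helicity(a_label, b_label):
    """Helicity sigma = B - A annihilated by a massless field of type (A, B)."""
    return Fraction(b_label) - Fraction(a_label)


REPRESENTATIONS = {
    "(1/2, 1/2) vector": ("1/2", "1/2"),
    "(1, 0)": ("1", "0"),
    "(0, 1)": ("0", "1"),
    "(1/2, 0)": ("1/2", "0"),
    "(0, 1/2)": ("0", "1/2"),
}

helicities = {name: massless_helicity(*labels) for name, labels in REPRESENTATIONS.items()}
for name, sigma in helicities.items():
    print(name, sigma)
```

The vector representation only reaches helicity 0, while $(1,0)$ and $(0,1)$ give the two helicities $\mp 1$, which is why the direct sum $(1,0) \oplus (0,1)$ is the natural home of a massless helicity-one particle.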
Question 85. What are the transformations of the coefficient functions for a massless vector field,
and how does this imply gauge-invariance?
Answer 85. Recall that the massless little group is $ISO(2)$ and the reference momentum is $k^\mu = (k, 0, 0, k)$ in $(t, \vec x)$ notation, i.e. along the $z$-axis. Because of the wavefunction transformation
$$U(\Lambda)\Psi_{p\sigma} = \sqrt{\frac{(\Lambda p)^0}{p^0}}\, e^{i\sigma\theta(\Lambda, p)}\, \Psi_{\Lambda p,\sigma},$$
we know the transformation of $a^\dagger_{\vec p\sigma}$ and $a_{\vec p\sigma}$. If the field transforms under a certain irrep $D_{\bar l l}(\Lambda)$ of the Lorentz group, then we also know the transformation of the coefficient functions. If $S(\alpha, \beta)$ describes the combined rotations and boosts of the little group in the xy-plane (i.e. the “translational” part of $ISO(2)$), then we end up with
$$u_{\bar l}(\vec k\sigma) = \sum_l D_{\bar l l}(S(\alpha, \beta))\, u_l(\vec k\sigma).$$
(There is also the $R(\theta)$ part of the little group, which is easy to implement.) We claim that the above equation cannot be satisfied for general irreps of the Lorentz group.
For example, take the vector representation $(\frac12, \frac12)$. Let us see why the above conditions cannot be satisfied. Now, $l = \mu$ and we can introduce the polarization vector $e^\mu$ by dividing out the boost term,
$$u^\mu(\vec p\sigma) = \frac{1}{\sqrt{2p^0}}\, e^\mu(\vec p\sigma).$$
The conditions to be satisfied are (Einstein summation convention)
$$e^\mu(\vec k\sigma)\, e^{i\sigma\theta} = R(\theta)^\mu{}_\nu\, e^\nu(\vec k\sigma) \quad\text{and}\quad e^\mu(\vec k\sigma) = S(\alpha, \beta)^\mu{}_\nu\, e^\nu(\vec k\sigma).$$
Unfortunately there is no such set $\{e^\mu(\vec k\sigma)\}$ which satisfies these relations. You can think about this as follows: because the vector field has spin $j = 1$, there should be three polarization vectors. Unfortunately, the rotational part of $ISO(2)$, which is $e^\mu(\vec k\sigma)\, e^{i\sigma\theta} = R(\theta)^\mu{}_\nu\, e^\nu(\vec k\sigma)$, parametrizes rotations about a single axis and hence only allows two polarization vectors. This is why one of the polarization vectors blows up in the $m \to 0^+$ limit of the massive vector field.
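To see concretely how the second condition fails, here is a sketch that applies an explicit $S(\alpha, \beta)$ to a transverse polarization vector. The matrix below is the form I recall from Weinberg's discussion of the massless little group, written with components ordered $(x, y, z, t)$ and reference momentum $k = (0, 0, 1, 1)$; treat that form as an assumption, although the code itself verifies that it is a Lorentz transformation and that it leaves $k$ fixed.

```python
import math

ETA = [1, 1, 1, -1]  # metric diagonal, components ordered (x, y, z, t)


def little_group_translation(alpha, beta):
    """Assumed explicit form of S(alpha, beta) for k along z."""
    zeta = (alpha ** 2 + beta ** 2) / 2
    return [
        [1, 0, -alpha, alpha],
        [0, 1, -beta, beta],
        [alpha, beta, 1 - zeta, zeta],
        [alpha, beta, -zeta, 1 + zeta],
    ]


def apply(matrix, vector):
    return [sum(matrix[i][j] * vector[j] for j in range(4)) for i in range(4)]


def is_lorentz(matrix, tol=1e-12):
    """Check S^T eta S = eta."""
    for mu in range(4):
        for nu in range(4):
            total = sum(matrix[r][mu] * ETA[r] * matrix[r][nu] for r in range(4))
            target = ETA[mu] if mu == nu else 0
            if abs(total - target) > tol:
                return False
    return True


K = [0, 0, 1, 1]
E_PLUS = [1 / math.sqrt(2), 1j / math.sqrt(2), 0, 0]  # transverse polarization

alpha, beta = 0.3, -0.7
S = little_group_translation(alpha, beta)
shift = [new - old for new, old in zip(apply(S, E_PLUS), E_PLUS)]
coefficient = (alpha + 1j * beta) / math.sqrt(2)
print(is_lorentz(S), apply(S, K), shift)
```

The polarization vector is not invariant: it picks up a piece proportional to $k^\mu$, namely $(\alpha + i\beta)/\sqrt 2$ times $k^\mu/|\vec k|$. A shift of $e^\mu$ by a multiple of $k^\mu$ is exactly what a gauge transformation does, which is how gauge invariance enters.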
Question 86. How do the above relations (although they cannot all be satisfied without extra
conditions) return the Coulomb gauge of the photon field?
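Answer 86. $e^\mu(\vec k\sigma)\, e^{i\sigma\theta} = R(\theta)^\mu{}_\nu\, e^\nu(\vec k\sigma)$ implies $e^\mu(\vec k\sigma) = \frac{1}{\sqrt 2}(0, 1, \pm i, 0)$ in $(t, \vec x)$ notation. Applying rotations and boosts gives, generally
$$e^0(\vec p\sigma) = 0, \quad \vec p\cdot\vec e(\vec p\sigma) = 0 \implies a^0(x) = 0, \quad \vec\nabla\cdot\vec a(x) = 0.$$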
Question 87. How can we insert massless fields into a Lagrangian?
Answer 87. There are two ways. (1) Demand gauge-invariance and ensure that the vector field $a_\mu$ is always coupled to a current $j_\mu$ such that $\partial^\mu j_\mu = 0$. (2) Construct the tensor field
$$F_{\mu\nu} = \partial_\mu a_\nu - \partial_\nu a_\mu$$
in the $(1, 0) \oplus (0, 1)$ representation. The relative minus sign between the two parts cancels the undesirable extra term in the transformations of the polarization vectors, so this field satisfies the entire $ISO(2)$ little group. Similarly, gravitons ($j = 2$) can be put together in an antisymmetric way to construct the tensor field $R_{\mu\nu\rho\sigma}$, the Riemann curvature tensor.
Massless particles with j ≥ 3 would couple to conserved tensors with three or more indices, but
because d = 1 + 3, aside from total derivatives there are none. Conclusion: “high-spin massless
particles cannot produce long-range forces.” This logic does not carry over to high-spin massive
particles because they are not constrained to couple to conserved tensors (i.e. no gauge-invariance).
High-spin massive particles are possible (and exist); they are usually very heavy.
5.9 Problems
Question 88. Consider a free field $\psi_l^\mu(x)$ which annihilates and creates a self-charge-conjugate particle of spin $j = \frac32$ and mass $m > 0$. Show how to calculate the coefficient functions $u_l^\mu(\vec p\sigma)$ which multiply the annihilation operators $a_{\vec p\sigma}$ in this field, in such a way that the field transforms under Lorentz transformations like a Dirac field $\psi_l$ with an extra four-vector index $\mu$. What field equations and algebraic and reality conditions does this field satisfy? Evaluate the matrix $P^{\mu\nu}(p)$, defined for $p^2 = -m^2$ by
$$P^{\mu\nu}_{lm}(p) = 2p^0\sum_\sigma u_l^\mu(\vec p\sigma)\, u_m^{\nu*}(\vec p\sigma).$$
What are the commutation relations of this field? How does the field transform under the inversions
P, C, T?
Answer 88. TODO
Question 89. Work out the transformation properties of fields of type (j, 0) ⊕ (0, j) for massless
particles of helicity ±j under the inversions P, C, T.
Answer 89. TODO
Question 90. Consider a general field $\psi_{ab}$ describing particles of spin $j$ and mass $m > 0$ which transforms according to the $(A, B)$ representation of the Lorentz group. Suppose it has an interaction Hamiltonian
$$V = \int d^3\vec x\, \big(\psi_{ab}(x)J^{ab}(x) + \text{h.c.}\big),$$
where $J^{ab}$ is an external c-number current. What is the asymptotic behavior of the matrix element for emitting these particles for energy $E \gg m$ and definite helicity? (Assume the Fourier transform of the current has values for different $a, b$ that are of the same order of magnitude, and that do not depend strongly on $E$.)
Answer 90. Weinberg shows in section 5.7 of his book that you can construct a massive spin-j theory quite easily. The point is that, as mentioned in https://physics.stackexchange.com/questions/14932/why-do-we-not-have-spin-greater-than-2, the propagator becomes badly divergent for high-spin massive particles. (However, the article I linked to mixes up massive and massless arguments, so it’s not correct the whole way through.)
The heuristic result (see link) is that the propagator grows approximately as
$$\Delta \sim E^{j-2}.$$
This is the same as the asymptotic behavior of the matrix element. A more rigorous way is to note
that in Weinberg (5.7.33), the construction of a field (A, B) of spin j involves many derivatives,
which are basically like energy.
If the propagator looks like $p^n$, then the position-space propagator looks like $x^{-n}$, so these effects become increasingly short-ranged. One way to think of this is if the exponent $n$ is very large, then the effect at small energies gets heavily suppressed.
(Weinberg notes in the footnote to pg. 253 that the conserved charge with the most indices in
1 + 3 dimensions is the energy-momentum 4-vector, and the corresponding current with the most
indices in 1 + 3 dimensions is the stress-energy tensor. This is important for massless particles
because they have to be gauged, and hence coupled to a conserved current. Thus, we cannot have
meaningful massless particles with j > 2, since the stress-energy tensor has two indices. This
is the graviton. Massive particles, the subject of this question, do not have to be coupled to a
conserved current. They may just as well be coupled to a scalar current, which they are in this
question.)
6 Feynman Rules
Feynman does rule! Feynman, Schwinger, and Tomonaga developed in the late 1940s a method
of perturbation theory which was Lorentz invariant to all orders. In this chapter, we will derive
Feynman rules with Dyson’s method (1949). First we will derive the position-space rules; then we
derive the momentum-space rules.
6.1 Dyson’s derivation in position space
Question 91. Describe Dyson’s (1949) derivation of the Feynman rules from perturbation theory.
Answer 91. The procedure is extremely similar to the first problem I solved in section 4.4 of these
notes. The S-matrix from initial state $\alpha = \vec p_1\sigma_1 n_1, \vec p_2\sigma_2 n_2, \cdots$ to final state
$\beta = \vec p_1'\sigma_1' n_1', \vec p_2'\sigma_2' n_2', \cdots$ is given by the Dyson series
$$S_{\beta\alpha} = \sum_{N=0}^{\infty} \frac{(-i)^N}{N!} \int d^4x_1 \cdots d^4x_N \, \langle\beta|\hat T\{U(x_1)\cdots U(x_N)\}|\alpha\rangle,$$
where $|\alpha\rangle = a^\dagger_{\vec p_1\sigma_1 n_1} a^\dagger_{\vec p_2\sigma_2 n_2} \cdots |0\rangle$, and similarly for $|\beta\rangle$. The interaction densities $U(x_i)$ are built
out of fields, which are built out of creation and annihilation operators just like the states $|\alpha\rangle$ and
$|\beta\rangle$.
We (anti)commute all annihilation operators to the RHS and all creation operators to the LHS.
There may be nonzero (anti)commutators which pop up as a result. In fact, only the product
of these nonzero (anti)commutators will survive, because $a_q|0\rangle = 0$ for any species $q$. Therefore,
every operator in this Dyson series must be contracted with another in order to survive. For
example (recall that $[\cdot,\cdot]_\pm$ stands for commutator and anticommutator, respectively):
$$[a_{\vec p\sigma n}, \psi_l^\dagger(x)]_\pm = \frac{e^{-ipx}}{(2\pi)^{3/2}}\, u_l^*(\vec p\sigma n).$$
The sum of all such contractions gives the Feynman diagrams.
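The statement that every operator must be paired with another can be made concrete with a small enumeration. The following is only a sketch: the function name `pairings` and the operator labels are hypothetical, and the code counts all complete pairings without checking which contractions actually vanish or which signs fermions would pick up.

```python
def pairings(ops):
    """Yield every way to split the list `ops` into unordered pairs.

    Each complete pairing stands for one candidate term in which all
    operators are contracted. An odd number of operators yields nothing,
    mirroring the fact that an unpaired operator annihilates the vacuum.
    """
    if not ops:
        yield []
        return
    first, rest = ops[0], ops[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


# Hypothetical labels: two operators from the external states and
# two field operators from interaction vertices.
labels = ["a_in", "a_out", "phi(x1)", "phi(x2)"]
for pairing in pairings(labels):
    print(pairing)
```

For $2n$ operators the number of complete pairings is $(2n-1)!!$, which is why the number of terms grows so quickly with the order $N$ of the Dyson series.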
Question 92. List the position-space Feynman rules.
Answer 92. To calculate the contribution to the S-matrix for a given process of order $N_i$ in each
of the interaction terms $U_i(x)$, this is the procedure:
1. Draw all Feynman diagrams containing $N_i$ vertices of each type $i$ and containing a line
coming into the diagrams from the left for each particle or antiparticle in the initial state,
and a line leaving the diagram from the right for each particle or antiparticle in the final
state. Draw any number of internal lines running from one vertex to another, as required to
give each vertex the proper number of attached lines.
2. The lines of any particle have an arrow pointing to the right (positive time); the lines of any
antiparticle have an arrow pointing to the left.
3. Each vertex gets a factor $-ig_i$. It is customary to normalize the coupling so that, for
example, the $\phi^4$ interaction is written as $g\phi^4/4!$; the $4!$ then cancels against the number of
equivalent contractions.
Particles and antiparticles leaving the diagram get factors
$$\frac{e^{-ip'x}}{(2\pi)^{3/2}}\, u_l^*(\vec p'\sigma' n') \quad\text{and}\quad \frac{e^{-ip'x}}{(2\pi)^{3/2}}\, v_l(\vec p'\sigma' n'), \text{ respectively.}$$
Particles and antiparticles entering the diagram get factors
$$\frac{e^{ipx}}{(2\pi)^{3/2}}\, u_l(\vec p\sigma n) \quad\text{and}\quad \frac{e^{ipx}}{(2\pi)^{3/2}}\, v_l^*(\vec p\sigma n), \text{ respectively.}$$
A (anti)particle which doesn’t interact with anything gets a factor $\delta^{(3)}(\vec p' - \vec p)\,\delta_{\sigma'\sigma}\,\delta_{n'n}$. An
internal line connecting two vertices in the diagram gets a factor
$$-i\Delta_{lm}(x - y).$$
4. Perform the integral $\int d^4x_1 \cdots d^4x_N$ and add up the results from each Feynman diagram.
There are sometimes extra (−1) or symmetry factors related to fermion loops and additional
scattering symmetries. The safest thing to do is just to start from Dyson’s formula and see how
many factors come out.
6.2
Propagators
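Question 93. Describe the propagator of an arbitrary field $\psi_l(x)$. What does it mean?
Answer 93. Let the annihilation and creation parts of an arbitrary field $\psi_l(x)$ be
$$\psi_l^+(x) = \sum_{\sigma n}\int d^3\vec p\; u_l(x; \vec p\sigma n)\, a_{\vec p\sigma n} \quad\text{and}\quad \psi_l^-(x) = \sum_{\sigma n}\int d^3\vec p\; v_l(x; \vec p\sigma n)\, a^\dagger_{\vec p\sigma n}.$$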
Such a propagator arises from Dyson’s formalism when the interactions $U_i$ are written in normal
ordered form, and we would like to (anti)commute $\psi_l(x)$ from $U_i(x)$ past $\psi_m^\dagger(y)$ from $U_j(y)$.
(Normal-ordering each interaction guarantees there are no contractions inside an interaction. We
choose $\psi_m^\dagger(y)$ instead of $\psi_m(y)$ because TODO). There are three cases to consider:
1. $x^0 > y^0$ and $x - y$ is timelike: Because time-ordering means “later on left,” and we are
trying to put creation fields on the left, it must be that (before we commute anything) the
order is $\psi_l(x)\psi_m^\dagger(y)$. We should therefore move the creation part of $\psi_m^\dagger(y)$ to the left of $\psi_l(x)$;
the nonvanishing part is
$$[\psi_l^+(x), \psi_m^{+\dagger}(y)]_\pm.$$
2. $x^0 < y^0$ and $x - y$ is timelike: Same thing, except we select the creation part of $\psi_l(x)$.
$$[\psi_m^{-\dagger}(y), \psi_l^-(x)]_\pm.$$
3. $x - y$ is spacelike: Here the time ordering depends on the reference frame. In previous work,
we said that Lorentz invariance of the S-matrix means that we must have $[\psi_l(x), \psi_m^\dagger(y)]_\pm = 0$
for spacelike separations. We will guarantee this is true based on the coefficients on the fields.
The result is what we will define as the propagator,
$$-i\Delta_{lm}(x - y) := \theta(x^0 - y^0)[\psi_l^+(x), \psi_m^{+\dagger}(y)]_\pm \pm \theta(y^0 - x^0)[\psi_m^{-\dagger}(y), \psi_l^-(x)]_\pm.$$
The relative $\pm$ between the two terms arises because the $\hat T$ operator does not produce any commutators but it does produce negative signs.
The propagator $\Delta_{lm}(x-y)$ means either a particle was created at $y$ and destroyed at $x$ (for $x^0 > y^0$)
or an antiparticle was created at $x$ and destroyed at $y$ (for $y^0 > x^0$). You can think of the magnitude of $\Delta(x-y)$
as how easily this process occurs. For example, $\Delta_{lm}(x - y)$ could be like a Coulomb potential; the
Green function of the D’Alembertian operator is $\frac{1}{4\pi r}\delta(t - \frac{r}{c})$ and falls off with distance $r = |\vec x - \vec y|$.
Question 94. For general fields, how can we calculate the propagator?
Answer 94. This is a little involved. Here it goes...
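From the previous question,
$$-i\Delta_{lm}(x - y) := \theta(x^0 - y^0)[\psi_l^+(x), \psi_m^{+\dagger}(y)]_\pm \pm \theta(y^0 - x^0)[\psi_m^{-\dagger}(y), \psi_l^-(x)]_\pm$$
$$= \theta(x^0 - y^0)\int\frac{d^3\vec p}{(2\pi)^3}\sum_\sigma u_l(\vec p\sigma n)\, u_m^*(\vec p\sigma n)\, e^{ip(x-y)} \pm \theta(y^0 - x^0)\int\frac{d^3\vec p}{(2\pi)^3}\sum_\sigma v_m^*(\vec p\sigma n)\, v_l(\vec p\sigma n)\, e^{ip(y-x)}.$$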
The spin sums are
$$\sum_\sigma u_l(\vec p\sigma n)\, u_m^*(\vec p\sigma n) = \frac{P_{lm}(\vec p, \sqrt{\vec p^2 + m_n^2})}{2\sqrt{\vec p^2 + m_n^2}} \quad\text{and}\quad \sum_\sigma v_l(\vec p\sigma n)\, v_m^*(\vec p\sigma n) = \pm\frac{P_{lm}(-\vec p, -\sqrt{\vec p^2 + m_n^2})}{2\sqrt{\vec p^2 + m_n^2}}.$$
Here, $P_{lm}(\vec p, \omega)$ is a polynomial in $\vec p$ and $\omega$. Apparently $P_{lm}$ has a general explicit form, (6.2.7).
Generally, it is the numerator of the propagator, so we know what it looks like for scalars, Dirac
fields, and vector fields. From now on, we will simply call $P_{lm}(\vec p, \omega) = P(p)$.
If we use $\Delta_+(x) = \int\frac{d^3\vec p}{(2\pi)^3}\frac{e^{ipx}}{2p^0}$, we find
$$-i\Delta_{lm}(x - y) = \theta(x^0 - y^0)P(p)\Delta_+(x - y) + \theta(y^0 - x^0)P(p)\Delta_+(y - x).$$
Now (6.2.10), Weinberg does something sneaky and extends the definition of $P(p)$ to momenta
which are off-shell; he calls this $P^{(L)}(p)$. Substituting $p = -i\partial_x$ and using derivative product rule
gives
$$\Delta_{lm}(x - y) = P^{(L)}(-i\partial_x)\Delta_F(x - y),$$
where $-i\Delta_F(x) := \theta(x^0)\Delta_+(x) + \theta(-x^0)\Delta_+(-x)$ is the Feynman propagator. Now we use
$\theta(t) = \frac{-1}{2\pi i}\int d\omega\,\frac{e^{-i\omega t}}{\omega + i\epsilon}$, which gives
$$\Delta_F(x) = \int\frac{d^4q}{(2\pi)^4}\frac{e^{iqx}}{q^2 + m^2 - i\epsilon} \implies \Delta_{lm}(x - y) = \int\frac{d^4q}{(2\pi)^4}\frac{P^{(L)}_{lm}(q)\, e^{iq(x-y)}}{q^2 + m^2 - i\epsilon}.$$
Of course, this is the Fourier transform of the momentum-space propagator, which is (6.3.3)
$$\frac{-i}{(2\pi)^4}\frac{P_{lm}(q)}{q^2 + m_l^2 - i\epsilon}.$$
Question 95. How is the Feynman propagator, with the generic $\frac{1}{k^2 - m^2}$ structure, related to the
propagators we use in condensed matter field theory, which have the generic $\frac{1}{\omega - \epsilon_k}$ structure?
Answer 95. The Feynman propagator has both retarded and advanced propagators hiding inside.
This is due to how we define the fields; for example, $\phi(x)$ is a linear combination of creation and
annihilation operators.
Roughly speaking, the sum of retarded and advanced propagators
$$\frac{1}{\omega - \epsilon_k} + \frac{1}{-\omega - \epsilon_k} \to \frac{1}{\omega^2 - \epsilon_k^2}$$
gives the Feynman propagator, where $\epsilon_k^2 = \vec k^2 + m^2$. This is also why we only need to draw a single
Feynman diagram to represent all combinations of advanced and retarded processes. Otherwise
we would have to care about which vertices are more to the right or to the left, which direction is
the photon line going in, etc. See section 5.1 of these notes.
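The arrow in the sum of the two poles hides a factor of $2\epsilon_k$. A quick exact check, written as a sketch with hypothetical function names and rational arithmetic so that no rounding is involved, confirms that the sum equals $2\epsilon_k/(\omega^2 - \epsilon_k^2)$; the $i\epsilon$ prescriptions are ignored here.

```python
from fractions import Fraction


def sum_of_poles(omega, eps_k):
    """1/(omega - eps_k) + 1/(-omega - eps_k), ignoring i*epsilon terms."""
    return 1 / (omega - eps_k) + 1 / (-omega - eps_k)


def relativistic_form(omega, eps_k):
    """2*eps_k / (omega^2 - eps_k^2), the 1/(k^2 - m^2)-like structure."""
    return 2 * eps_k / (omega**2 - eps_k**2)


# Arbitrary off-shell sample values (omega != +-eps_k).
omega, eps_k = Fraction(3), Fraction(2)
print(sum_of_poles(omega, eps_k), relativistic_form(omega, eps_k))
```

Since $\omega^2 - \epsilon_k^2 = \omega^2 - \vec k^2 - m^2$, the denominator is exactly the relativistic $k^2 - m^2$ in a mostly-minus metric.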
6.3
Problems
Question 96. Consider the theory of a real scalar field $\phi$ with interaction (in the interaction picture)
$V(t) = \frac{g}{3!}\int d^3\vec x\,\phi(x)^3$. Calculate the connected S-matrix element for scalar-scalar scattering
to $O(g^2)$, doing all integrals. Use the results to calculate the differential cross-section for this
process in the CM reference frame.
Answer 96. Scalar-scalar scattering means $2 \to 2$ scattering. We use the Dyson formula described
in section 6.1 of these notes,
$$S_{\beta\alpha} = \sum_{N=0}^{\infty} \frac{(-i)^N}{N!} \int d^4x_1 \cdots d^4x_N \, \langle\beta|\hat T\{U(x_1)\cdots U(x_N)\}|\alpha\rangle,$$
where $|\alpha\rangle = a^\dagger_{\vec p_1} a^\dagger_{\vec p_2}|0\rangle$, and similarly $|\beta\rangle = a^\dagger_{\vec p_3} a^\dagger_{\vec p_4}|0\rangle$.
It should be clear that when we smash the field
$$\phi(x) = \int_k \frac{1}{\sqrt{2E_{\vec k}}}\left(a_{\vec k}\, e^{ikx} + a^\dagger_{\vec k}\, e^{-ikx}\right)$$
into the state $|\alpha\rangle$, for example, we will get an external factor of
$$\frac{1}{(2\pi)^{3/2}}\frac{e^{ikx}}{\sqrt{2E_{\vec k}}}.$$
Here are the possible diagrams:
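Using the position-space Feynman rules (incoming lines carry $e^{ipx}$, outgoing lines carry $e^{-ipx}$), we find that
$$S_{\vec p_3\vec p_4,\vec p_1\vec p_2} = \frac{(ig)^2}{(2\pi)^6\sqrt{16E_1E_2E_3E_4}}\int_{xy}(-i\Delta(x, y))\left(e^{ix(p_1+p_2)-iy(p_3+p_4)} + e^{ix(p_1-p_3)+iy(p_2-p_4)} + e^{ix(p_1-p_4)+iy(p_2-p_3)}\right).$$
Using the momentum-space Feynman rules, we find that $S_{\vec p_3\vec p_4,\vec p_1\vec p_2} = -2\pi i\,\delta^4(p_f - p_i)\, M_{\vec p_3\vec p_4,\vec p_1\vec p_2}$,
where $p_i$ and $p_f$ are the total initial and final four-momenta and
$$M_{\vec p_3\vec p_4,\vec p_1\vec p_2} = \frac{(ig)^2/(2\pi i)}{(2\pi)^6\sqrt{16E_1E_2E_3E_4}}\left(\frac{i}{(p_1+p_2)^2 - m^2 + i\epsilon} + \frac{i}{(p_1-p_3)^2 - m^2 + i\epsilon} + \frac{i}{(p_1-p_4)^2 - m^2 + i\epsilon}\right).$$
Introducing the Mandelstam variables $s, t, u$ (we don’t even need to know which is which) and
using the CM differential cross-section in Weinberg (3.4.30) gives
$$\frac{d\sigma}{d\Omega} = \frac{g^4|\vec k'|}{(2\pi)^{10}E^2|\vec k|}\left|\frac{1}{s - m^2 + i\epsilon} + \frac{1}{t - m^2 + i\epsilon} + \frac{1}{u - m^2 + i\epsilon}\right|^2.$$
Here, $|\vec k|$ and $|\vec k'|$ are the momenta of either of the scalar particles before and after the collision,
and $E$ is the total energy (which is conserved).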
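To get a feeling for the three denominators, it helps to evaluate $s$, $t$, $u$ for explicit CM kinematics. The sketch below is an illustration only: the function names are hypothetical, and it assumes equal masses, elastic scattering, and a mostly-minus metric $p^2 = E^2 - \vec p^2$ (the convention in which the denominators read $s - m^2$). It checks the standard identity $s + t + u = 4m^2$.

```python
import math


def minkowski_sq(p):
    """Return p^2 = E^2 - |p|^2 for p = (E, px, py, pz), mostly-minus metric."""
    energy, px, py, pz = p
    return energy**2 - px**2 - py**2 - pz**2


def combine(p, q, sign):
    """Return p + sign*q componentwise."""
    return tuple(a + sign * b for a, b in zip(p, q))


def cm_mandelstam(mass, k, theta):
    """Mandelstam variables for equal-mass elastic 2 -> 2 scattering in the CM frame.

    `k` is the magnitude of each three-momentum and `theta` the scattering angle.
    """
    energy = math.sqrt(mass**2 + k**2)
    p1 = (energy, 0.0, 0.0, k)
    p2 = (energy, 0.0, 0.0, -k)
    p3 = (energy, k * math.sin(theta), 0.0, k * math.cos(theta))
    p4 = (energy, -k * math.sin(theta), 0.0, -k * math.cos(theta))
    s = minkowski_sq(combine(p1, p2, +1))
    t = minkowski_sq(combine(p1, p3, -1))
    u = minkowski_sq(combine(p1, p4, -1))
    return s, t, u


s, t, u = cm_mandelstam(mass=1.0, k=2.0, theta=0.7)
print(s, t, u, s + t + u)
```

In this frame $s = (2E_{\vec k})^2$ is fixed by the energy, while $t$ and $u$ are negative and trade places under $\theta \to \pi - \theta$, which is why the cross-section above does not care which is which.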
Question 97. What is the contribution in Feynman diagrams from the contraction of the derivative
$\partial_\mu\psi_l(x)$ of a Dirac field with the adjoint $\psi_m^\dagger(y)$ of the field?
Answer 97. You just take the regular propagator
$$\Delta_{lm}(x - y) = \int\frac{d^4q}{(2\pi)^4}\frac{P^{(L)}_{lm}(q)\, e^{iq(x-y)}}{q^2 + m^2 - i\epsilon}$$
and apply $\partial_\mu$. So the momentum-space contribution has an extra $iq_\mu$ or something, but it looks
basically like a propagator.
Question 98. Use Gell-Mann - Low theorem to give expressions for the vevs of these Heisenberg-picture
operators: $\Phi(x)$, $\hat T\{\Phi(x)\Phi(y)\}$. Use the interaction $V(t) = \frac{g}{3!}\int d^3\vec x\,\phi(x)^3$ and calculate the
answer to $O(g)$ and also $O(g^2)$.
Answer 98. Take the Lagrangian density to be (in Schwartz notation with $+---$ metric)
$$\mathcal L = \frac12(\partial_\mu\phi)(\partial^\mu\phi) - \frac12 m^2\phi^2 + \frac{g}{3!}\phi^3.$$
Gell-Mann - Low theorem basically says we only need the connected diagrams, because the disconnected ones will be divided out from the denominator. As far as I can tell, these are the only
diagrams:
and let us do $\langle\Phi(x)\rangle$ first. We expect this to be independent of $x$. Interpreting the diagram gives
$$\langle\Phi(x)\rangle = ig\int_y \Delta(x, y)\Delta(y, y) = 0.$$
This is independent of $x$ because we are effectively integrating over all possible values of $x - y$.
We could guess it is zero if $g$ is small enough such that there is no spontaneous symmetry breaking.
The mathematical reason is that $\Delta(y, y) = 0$ since it is neither advanced nor retarded.
The other one is just
$$\langle\Phi(x)\Phi(y)\rangle = \Delta(x, y) - g^2\int_{z_1 z_2}\Delta(x, z_1)\Delta(z_1, z_2)^2\Delta(z_2, y).$$
In fact, there is another diagram at O(g 2 ): the Hartree diagram. However, this is usually taken
to be zero because ∆(x, x) = 0 since it is neither retarded nor advanced. Or if you have reason to
think ∆(x, x) is not zero, you can add the Hartree contribution.
7
The Canonical Formalism
The meaning of canonical formalism is: postulate a Lagrangian and apply the rules of canonical
quantization; i.e. with canonical (anti)commutations. This is historically how quantum field
theory was developed.
We saw in the previous chapter that we can calculate S-matrix elements using the Hamiltonian
formalism. However, in this chapter we introduce the Lagrangian formalism. One advantage
is that the Lorentz-invariance of the classical Lagrangian guarantees Lorentz-invariance of the
quantum theory. Another advantage is that there will be no ad hoc non-scalar terms which must
be added to the interaction density (for example, Aµ Jµ for a massless photon) to compensate for
non-covariant terms in the propagators. These will pop out automatically from the Lagrangian
formulation.
7.1
Canonical variables
Question 99. What is “canonical” about the fields we constructed in previous chapters, ψl (x)?
Answer 99. The free fields we constructed previously are “canonical” in the sense that they
satisfy the canonical commutations. Schematically,
[x, p] = i, [x, x] = [p, p] = 0.
We found previously that, for instance, the real scalar field satisfies
$$[\phi(x), \phi(y)] = \Delta(x - y) \quad\text{and}\quad \Delta(\vec x, 0) = 0, \quad \dot\Delta(\vec x, 0) = -i\delta^3(\vec x).$$
The above properties follow from the definition $\Delta(x) = \int\frac{d^3\vec k}{(2\pi)^3}\frac{e^{ikx} - e^{-ikx}}{2k^0}$. Therefore, the field and
its time derivative (i.e. the conjugate momentum) satisfy
$$[\phi(\vec x, t), \dot\phi(\vec y, t)] = i\delta^3(\vec x - \vec y).$$
This follows from differentiating with respect to $y^0$.
Question 100. Describe how different kinds of fields can be interpreted as conjugate pairs of
canonical variables.
Answer 100. The case of the real scalar field was done above. For the complex scalar field, the
commutation was $[\phi(x), \phi^\dagger(y)] = \Delta(x - y)$, so we have to change the canonical momentum to
$$(q, p) = (\phi, \dot\phi^\dagger).$$
Or, you can decompose into real and imaginary parts and make both of them canonical fields,
$$(q, p) = (\phi_1, \dot\phi_1) \text{ and } (\phi_2, \dot\phi_2),$$
where $\phi = \frac{1}{\sqrt 2}(\phi_1 + i\phi_2)$.
For the spin-1 vector field $v^\mu$, it turns out that $v^0$ can be expressed in terms of the other variables,
so there are only three canonical variables, $v^i$. (This follows from $\partial_\mu v^\mu = 0$, which essentially
removes one degree of freedom. This is because spin-1 only really needs three degrees of freedom, while a
vector field has four.) We may take
$$(q^i, p^i) = (v^i, \dot v^i + \partial_i v^0) \quad\text{and}\quad v^0 = \frac{\partial_i p^i}{m^2}.$$
For the Dirac field, we take
$$(q, p) = (\psi, i\psi^\dagger).$$
The factor of $i$ is the factor in $[x, p] = i$.
Conclusion: the identification of fields as canonical variables, and their conjugates, must be done
carefully. Sometimes you must take a time-derivative (scalar and vector fields) and sometimes you
don’t have to (Dirac field). (This is because the anticommutator of $\psi(x)$ and $\psi^\dagger(x)$ already had a
derivative, $\not\partial$.)
Question 101. In a canonical formalism, describe how a functional derivative is mapped to a
commutator.
Answer 101. This is so pretty! If $F[q, p]$ is bosonic, then it doesn’t matter whether $q$ and $p$ are
bosonic or fermionic; we have
$$\frac{\delta F}{\delta q} = i[p, F] \quad\text{and}\quad \frac{\delta F}{\delta p} = i[F, q].$$
This is a generalization of Hamilton’s equations of motion. Beautiful!
This can be motivated by imagining $F$ is normal ordered, with all $q$s on the left and all $p$s on the
right. However, the above is a definition that resolves ambiguities when $F$ may not be normal-ordered.
Question 102. Transform the free-field Hamiltonian from second-quantized notation into the
canonical variables.
Answer 102. Consider a real scalar field. The free-field Hamiltonian is
$$H = \int_{\vec k}\sqrt{\vec k^2 + m^2}\; a^\dagger_{\vec k} a_{\vec k}.$$
Recall that the expression for the real scalar field is
$$\phi(\vec x, t) = \int_{\vec k}\frac{1}{\sqrt{2k^0}}\left(a_{\vec k}\, e^{ikx} + a^\dagger_{\vec k}\, e^{-ikx}\right).$$
Up to the zero-point energy, this turns out to be
$$H = \frac12\int_{\vec x}\left(p^2 + (\vec\nabla q)^2 + m^2q^2\right).$$
(Schematically, you can see this by noting
$$\frac12\int_{\vec x}\left(p^2 + (\vec\nabla q)^2 + m^2q^2\right) \sim \frac12\left(\frac{1}{k^0}\left((k^0)^2 + \vec k^2 + m^2\right)\right) \sim k^0,$$
since $k^0 = \sqrt{\vec k^2 + m^2}$.) I feel this is not very nice; for example, it’s not clear to me that this
expression $H[p, q]$ is even the only one that returns $H = \int_{\vec k}\sqrt{\vec k^2 + m^2}\; a^\dagger_{\vec k} a_{\vec k}$. However, Weinberg’s
logic is that the second-quantized form of $H$ is the one which is certain, and we are just trying to
find a way to write it with $q$s and $p$s.
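The schematic counting above can be checked for a single classical mode. The sketch below is an assumption-laden toy: it replaces the operator $a_{\vec k}$ by a complex number `a`, folds the gradient and mass terms into one frequency $\omega^2 = \vec k^2 + m^2$, and uses the hypothetical function name `mode_energy`. It verifies that $\frac12(p^2 + \omega^2 q^2)$ equals $\omega|a|^2$ at all times, the classical analog of $k^0 a^\dagger a$.

```python
import cmath
import math


def mode_energy(a, omega, t):
    """Energy (p^2 + omega^2 q^2)/2 of one classical mode.

    q(t) = (a e^{-i omega t} + c.c.) / sqrt(2 omega) and p = dq/dt.
    """
    a = complex(a)
    phase = cmath.exp(-1j * omega * t)
    norm = math.sqrt(2.0 * omega)
    q = (a * phase + (a * phase).conjugate()) / norm
    p = -1j * omega * (a * phase - (a * phase).conjugate()) / norm
    # q and p are real by construction; keep only the real parts.
    return 0.5 * (p.real**2 + omega**2 * q.real**2)


a, omega = 0.3 + 0.4j, 2.0
for t in (0.0, 0.37, 5.0):
    print(t, mode_energy(a, omega, t), omega * abs(a) ** 2)
```

The oscillating pieces proportional to $a^2 e^{-2i\omega t}$ cancel between the $p^2$ and $\omega^2 q^2$ terms, which is the same cancellation that removes the $aa$ and $a^\dagger a^\dagger$ terms in the quantum Hamiltonian.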
Question 103. Transform the Hamiltonian from the previous question, H[q, p], to a Lagrangian.
Show it returns the usual scalar Lagrangian of relativistic QFT.
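Answer 103. We use the Legendre transform
$$L = \int_{\vec x} p(x)\dot q(x) - H.$$
Using $p = \dot q$ and $q = \phi$, this gives (in Weinberg’s mostly-plus metric, where $\partial_\mu\phi\,\partial^\mu\phi = -\dot\phi^2 + (\vec\nabla\phi)^2$)
$$L = \int_{\vec x}\left(p\dot q - \frac12 p^2 - \frac12(\vec\nabla q)^2 - \frac12 m^2q^2\right) = -\frac12\int_{\vec x}\left(\partial_\mu\phi\,\partial^\mu\phi + m^2\phi^2\right).$$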
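The Legendre transform can be tried out on a single degree of freedom before trusting it for fields. The following sketch uses hypothetical function names and a harmonic oscillator $H = \frac12(p^2 + \omega^2q^2)$ standing in for one field mode; it eliminates $p$ with Hamilton's equation $\dot q = \partial H/\partial p = p$ and compares against the familiar $L = \frac12(\dot q^2 - \omega^2 q^2)$.

```python
from fractions import Fraction


def hamiltonian(q, p, omega):
    """H = (p^2 + omega^2 q^2) / 2 for a single oscillator."""
    return (p * p + omega * omega * q * q) / 2


def lagrangian_from_legendre(q, q_dot, omega):
    """L = p*q_dot - H, with p eliminated via q_dot = dH/dp = p."""
    p = q_dot
    return p * q_dot - hamiltonian(q, p, omega)


def lagrangian_direct(q, q_dot, omega):
    """The expected result, kinetic minus potential energy."""
    return (q_dot * q_dot - omega * omega * q * q) / 2


q, q_dot, omega = Fraction(2, 3), Fraction(-5, 7), Fraction(3)
print(lagrangian_from_legendre(q, q_dot, omega), lagrangian_direct(q, q_dot, omega))
```

The relative sign between the kinetic and potential terms flips in going from $H$ to $L$, which is the one-dimensional version of the sign pattern in the field-theory result.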
Question 104. Show that the canonical variables of a free theory can be recast into the canonical
variables of an interacting theory via similarity transformation.
Answer 104. Let Q, P be the canonical variables in the interacting theory and q, p be the canonical variables in the free theory. Consider the similarity transforms which preserve the canonical
commutators:
$$Q(\vec x, t) = e^{iHt}\, q(\vec x, 0)\, e^{-iHt} \quad\text{and}\quad P(\vec x, t) = e^{iHt}\, p(\vec x, 0)\, e^{-iHt}.$$
Here, $H$ is the full Hamiltonian; these transforms commute with $H$, so if the interacting Hamiltonian is expressed in the free variables as $H = F[q, p]$, it can also be written $H = F[Q, P]$.
What does this mean? It means the canonical variable in the interacting Hamiltonian is a time-evolution of the canonical variable in the free Hamiltonian. First of all, when exactly we start the
time evolution is irrelevant because the start of time can always be changed by arbitrary phase
shift. (Here, the start time is $t = 0$. However, once we pick a start time, we must be consistent
over all canonical variables.) Second, we have to evolve with the entire Hamiltonian. While it
may be true that
$$e^{iH_0t}\, q\, e^{-iH_0t} = q,$$
it is not true that we can merely evolve with the interaction $V$ in $H = H_0 + V$. This is because
generally $[H_0, V] \neq 0$, so the exponential may imply $H_0$, the free Hamiltonian, has some nontrivial
effect on the “free” canonical variable.
7.2
Lagrangian formalism
It turns out that the best way to choose the Hamiltonian of a theory is to choose the Lagrangian
first, then Legendre-transform to find the Hamiltonian. In what follows, let capital fields $\Psi$ and
$\Pi$ represent interacting (conjugate) fields; the lowercase $\psi$ and $\pi$ are the free (conjugate) fields.
Careful! These fields are no longer canonical in general. Even though we can define the Hamiltonian equations of motion, and hence describe which fields are conjugate to each other, they do not
generally satisfy the canonical (anti)commutations. That is a product of special choices like the
ones in the previous section. So there is a difference between having a conjugate friend and having
a conjugate friend who is also canonical.
Question 105. Describe the motivation leading to the Euler-Lagrange equations of motion.
Answer 105. From this viewpoint, the Euler-Lagrange equations are merely restatements of the
equations of motion for the conjugate fields $\Pi(\vec x, t)$. How?
The definition and equation of motion for $\Pi$ is:
$$\Pi(\vec x, t) = \frac{\delta L}{\delta\dot\Psi(\vec x, t)} \quad\text{and}\quad \dot\Pi(\vec x, t) = \frac{\delta L}{\delta\Psi(\vec x, t)}.$$
(The equation of motion is a restatement of Hamilton’s equation, but with a negative sign because
$H = p\dot q - L$. The definition is motivated by, I guess, making the below action stationary?
Heuristically, if $\dot\Pi\,\delta\Psi = \delta L$ is true, and we want $\int dt\, L$ to be stationary, then we have to absorb
this equation of motion into a total time derivative on the LHS, which means adding the term
$\Pi\,\delta\dot\Psi$.) This means that $\partial_t\left(\frac{\delta L}{\delta\dot\Psi(\vec x, t)}\right) = \frac{\delta L}{\delta\Psi(\vec x, t)}$, which is just the stationarity condition for the action
$$I[\Psi, \dot\Psi] = \int dt\, L[\Psi, \dot\Psi].$$
So now we have an action $I[\Psi, \dot\Psi]$.
We expect $I[\Psi, \dot\Psi]$ to be a Lorentz scalar because it gives the equations of motion, which should
be Lorentz-covariant. Because $I = \int dt\, L$, for Lorentz scalar we expect there to be a $\int d^4x$, which
implies
$$L = \int d^3\vec x\; \mathcal L,$$
where $\mathcal L[\Psi, \partial_\mu\Psi]$ is the Lagrangian density, which itself is also a Lorentz scalar (i.e. because
the measure $d^4x$ is a Lorentz scalar). We guess it is a function of $\partial_\mu\Psi$ because that has all the
indices.
The stationarity condition of the action $\partial_t\left(\frac{\delta L}{\delta\dot\Psi(\vec x, t)}\right) = \frac{\delta L}{\delta\Psi(\vec x, t)}$ can now be expressed as a stationarity
condition of $\mathcal L$, which turns out to be the Euler-Lagrange equation
$$\partial_\mu\left(\frac{\partial\mathcal L}{\partial(\partial_\mu\Psi)}\right) = \frac{\partial\mathcal L}{\partial\Psi}.$$
Conclusion: The Euler-Lagrange equation just above is a restatement of the Hamilton equation
of motion for the conjugate field, under the assumption that the Lagrangian can be written as a
spacetime integral of some Lagrangian density.
Question 106. Show that the Legendre transform back to the Hamiltonian eliminates the time-derivative field $\dot\Psi$.
Answer 106. Consider
$$H = \int_{\vec x}\Pi(\vec x, t)\dot\Psi(\vec x, t) - L[\Psi(t), \dot\Psi(t)].$$
By definition of $\Pi$, $\frac{\delta H}{\delta\dot\Psi} = 0$, so generally $H = H[\Psi, \Pi]$. In fact, the same equations of motion are
satisfied as in the canonical case,
$$\dot\Psi = \frac{\delta H}{\delta\Pi}, \quad \dot\Pi = -\frac{\delta H}{\delta\Psi}.$$
However, as I mentioned in the introduction to this section, although Ψ and Π are conjugate, they
may not be canonical conjugates. Sometimes you can force them to be canonical by imposing the
(anti)commutations. Then, the functional derivative could again be interpreted as a commutator
with the bosonic energy H (see the previous section).
In fact, imposing such (anti)commutations is not always possible. For example, we saw that
$i\psi^\dagger(x)$, the Hermitian conjugate of the Dirac field, is not a canonical field variable (in fact, it is
more like a conjugate variable), and thus its time-derivative does not appear in $L$. So fields are
not always mapped to field variables, as we will see in the next question.
Question 107. Describe the canonical quantization of a general theory.
Answer 107. As we saw in the previous question, some fields (such as the time component of
vector field or the Dirac field) appear in the Lagrangian, but without their time-derivatives. So,
these fields have no canonical conjugates. Let us denote them by $C^r$.
The fields which do appear with their time derivatives can have canonical conjugates. Let us
denote them, and their time-derivatives, by $Q^n$ and $\dot Q^n$.
The canonical conjugates of the fields $Q^n$ are defined from the Lagrangian $L[Q, \dot Q, C]$:
$$P_n(x) = \frac{\delta L}{\delta\dot Q^n}, \quad 0 = \frac{\delta L}{\delta\dot C^r} = \frac{\delta L}{\delta C^r},$$
or, in terms of the Lagrangian density,
$$P_n(x) = \frac{\partial\mathcal L}{\partial\dot Q^n}, \quad 0 = \frac{\partial\mathcal L}{\partial\dot C^r} = \frac{\partial\mathcal L}{\partial C^r}.$$
The Legendre transform to the Hamiltonian involves only the $P$ and $Q$ variables, because the $C$
variables have no conjugates:
$$H = \int d^3\vec x\,(P_n\dot Q^n) - L[Q, \dot Q, C] = \int d^3\vec x\,(P_n\dot Q^n - \mathcal L).$$
Canonical quantization means “express everything in the canonical variables.” In other words,
given the equations above, we have enough information to solve for $C^r$ and $\dot Q^n$ in terms of $Q^n$ and
$P_n$. Then the Hamiltonian is a functional only in the canonical variables,
$$H = H[Q, P].$$
Question 108. For a given theory which has been canonically quantized, how do we transition
from Heisenberg picture to interaction picture?
Answer 108. Recall that in the previous question regarding canonical quantization, all fields Q
and P were in the Heisenberg picture. We would like to switch to the interaction picture because
that is how we compute S-matrix elements (to remember why, recall that the propagators of fields
are calculated for H0 and the interactions are treated as perturbations. This implies the fields are
evolving via $H_0$, hence we must have been in interaction picture. Obviously, the field $\phi^-(x)$ has
an $e^{-i\omega t}$ attached to it. Of course, this is post-mortem justification; the Feynman rules already
assume this).
To switch back to interaction picture, in which the fields evolve only with $H_0$ instead of the full
Hamiltonian $H$, we simply “undo” the similarity transformation described in the previous section.
In fact, $q = Q$ and $p = P$ at $t = 0$, so we just rewrite
$$H_{\text{Heisenberg}}[P, Q] \to H_{\text{interaction}}[p, q].$$
The functional form is exactly the same. Then because operators evolve with $H_0$, obviously
$$H_0(t) = H_0(t = 0) \quad\text{and}\quad V(t) = e^{iH_0t}\, V(t = 0)\, e^{-iH_0t}.$$
Unfortunately, the exact details of finding, for example, the expansion of $q(x)$ into $a^\dagger_{\vec p\sigma}$ and $a_{\vec p\sigma}$
operators are different for every theory. See Weinberg pg. 304-5.
7.3
Global symmetries and Noether theorem
According to Weinberg, the Lagrangian formalism “provides a natural framework for the quantum-mechanical interpretation of symmetry principles.” What he means is that Noether’s theorem,
i.e. for global symmetries, is easiest to see in the Lagrangian formalism. This is because the
Lagrangian formalism involves a stationarity condition (small change in the field), which is easily
adapted to the global rotation of Noether’s theorem. This is described below.
Question 109. Give a short derivation of Noether’s theorem for global symmetries.
Answer 109. Suppose there is a global ($\epsilon$ constant) transform
$$\Psi(x) \to \Psi(x) + i\epsilon\mathcal F(x)$$
which preserves the action. (Even when the action is not minimized, or equivalently when the
dynamical equations are not satisfied. For example, $\mathcal L = \bar\psi(\not\partial - m)\psi$ is invariant under $\psi \to e^{i\theta}\psi$
even if $(\not\partial - m)\psi \neq 0$.) Now consider the local transform (here, $\mathcal F$ is known)
$$\Psi(x) \to \Psi(x) + i\epsilon(x)\mathcal F(x).$$
The variation of the action no longer vanishes; however, it can always be written in the form
$$\delta I = -\int d^4x\, J^\mu(x)\,\partial_\mu\epsilon(x) = \int d^4x\,\epsilon(x)\,\partial_\mu J^\mu(x)$$
which forces it to vanish when $\epsilon(x) = \text{const}$. If we choose the fields to satisfy the field equations
(i.e. if we choose a stationary point), then $\delta I = 0$ regardless of the local $\epsilon(x)$. Therefore, the
current $J^\mu$ is conserved when the Euler-Lagrange equations are satisfied:
$$0 = \partial_\mu J^\mu \implies 0 = \frac{dF}{dt}, \quad\text{where } F = \int d^3\vec x\, J^0.$$
Important: this derivation only works for global symmetries. If the symmetry was not global, we
would not be able to write $\partial_\mu\epsilon(x)$ in the above. (Well, if the symmetry is local, then it is also
global, so not such a big deal.) This derivation is the same in classical theories.
Important: the conservation of current $\partial_\mu J^\mu = 0$ is only true in the “ground state” of the system.
You can imagine it like this: suppose I have a stretchy ribbon resting on a flat plane, enclosing a
fixed area $A$; the action is $I = \sigma L$, where $L$ is the length of the ribbon. The ground state is when
the ribbon is a perfect circle. Even if it is not a perfect circle, the energy $H$ is invariant under
a global rotation around the $\hat z$-axis by angle $\epsilon$. If we allow the rotation angle $\epsilon = \epsilon(\theta)$, then the
total length might change (i.e. $\delta I \neq 0$). Here, the analog of $J^\mu$ is basically $r(\theta)$. We see that in
the ground state, obviously
$$\partial_\theta r(\theta) = 0 \text{ because for a circle, } r = \text{const}.$$
However, $\partial_\theta r(\theta) \neq 0$ generally for non-circular shapes. So the current is only conserved when the
equations of motion hold.
From Wikipedia: “Noether’s theorem is an on-shell theorem: it relies on use of the equations
of motion - the classical path... The quantum analogs of Noether’s theorem probing off-shell
quantities as well are the Ward-Takahashi identities.”
Phew! At least we cleared that up now.
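A finite-dimensional analog makes the on-shell nature of the conservation law tangible. The sketch below is not from Weinberg: it assumes a unit-mass particle in a two-dimensional harmonic potential, integrates the equations of motion with a kick-drift-kick (velocity Verlet) scheme, and uses hypothetical function names. With equal spring constants the rotation symmetry holds and the Noether charge $x p_y - y p_x$ stays constant along the solution; with unequal ones the symmetry is broken and the charge drifts.

```python
def angular_momentum(x, y, px, py):
    """Noether charge of rotations in the plane."""
    return x * py - y * px


def evolve(x, y, px, py, dt, steps, kx=1.0, ky=1.0):
    """Integrate a unit-mass particle in V = (kx x^2 + ky y^2)/2.

    Uses kick-drift-kick steps; for kx == ky each kick is parallel to the
    position and each drift is parallel to the momentum, so the angular
    momentum is preserved up to floating-point rounding.
    """
    for _ in range(steps):
        px -= 0.5 * dt * kx * x
        py -= 0.5 * dt * ky * y
        x += dt * px
        y += dt * py
        px -= 0.5 * dt * kx * x
        py -= 0.5 * dt * ky * y
    return x, y, px, py


start = (1.0, 0.0, 0.3, 0.8)
symmetric = evolve(*start, dt=0.01, steps=1000, kx=1.0, ky=1.0)
broken = evolve(*start, dt=0.01, steps=1000, kx=1.0, ky=4.0)
print(angular_momentum(*start))
print(angular_momentum(*symmetric))
print(angular_momentum(*broken))
```

The charge is only constant because the trajectory solves the equations of motion; evaluating $x p_y - y p_x$ along an arbitrary path through phase space would give no such constancy, which is the point of the Wikipedia remark quoted above.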
Question 110. When can we find, explicitly, the conserved charge $F$ and the conserved current
$J^\mu$?
Answer 110. We can always find $F$ when the Lagrangian $L$ (and not just its time-integral, the
action $I$) is invariant under the symmetry. We can always find $J^\mu$ when the Lagrangian density
$\mathcal L$ is invariant under the symmetry. The derivations are easy (pg. 308-9) and the results are
$$F(t) = -i\int d^3\vec x\,\frac{\delta L}{\delta\dot\Psi}\,\mathcal F(\vec x, t) \quad\text{and}\quad J^\mu = -i\,\frac{\partial\mathcal L}{\partial(\partial_\mu\Psi)}\,\mathcal F.$$
Question 111. What does it mean for the conserved charge $F$ to generate the symmetry $\Psi \to
\Psi + i\epsilon\mathcal F$?
Answer 111. Suppose the variable $\Psi$ is, in fact, a canonical variable with conjugate $\Pi$. Then
$\Pi = \frac{\delta L}{\delta\dot\Psi}$, and the formula in the previous answer becomes
$$F(t) = -i\int d^3\vec x\,\Pi\,\mathcal F(\vec x, t)$$
so we have
$$[F(t), \Psi(\vec x, t)] = -\mathcal F(\vec x, t) \implies e^{-i\epsilon F}\,\Psi\, e^{i\epsilon F} = \Psi + i\epsilon\mathcal F.$$
For example, the Hamiltonian generates translations in time. This derivation, of course, only works
when the answer to the above question is true: when the global symmetry leaves the Lagrangian
$L$ invariant, in addition to leaving the action $I = \int dt\, L$ invariant.
A nice fact to know (pg. 312-3) is that if the generators $t_a$ of the transforms of canonical fields
$$Q^n(x) \to Q^n(x) + i\epsilon^a (t_a)^n{}_m\, Q^m(x)$$
satisfy a Lie algebra $[t_a, t_b] = i f_{ab}{}^c\, t_c$, then so do the conserved charges, $[T_a, T_b] = i f_{ab}{}^c\, T_c$.
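The simplest place to see a charge generating a symmetry is a single oscillator mode, where the number operator plays the role of the $U(1)$ charge. The sketch below is a toy under stated assumptions: it truncates Fock space to a finite dimension, takes $F \to N = \mathrm{diag}(0, 1, 2, \ldots)$ and $\Psi \to a$, and uses hypothetical function names. Because $[N, a] = -a$, conjugating with $e^{-i\theta N}$ multiplies $a$ by the phase $e^{i\theta}$, whose first-order expansion is $a + i\theta a$.

```python
import cmath
import math


def lowering_operator(dim):
    """Matrix of the annihilation operator a in a Fock space cut off at `dim` states."""
    a = [[0j] * dim for _ in range(dim)]
    for m in range(dim - 1):
        a[m][m + 1] = complex(math.sqrt(m + 1))
    return a


def conjugate_by_charge(op, theta):
    """Return exp(-i theta N) op exp(+i theta N) for N = diag(0, 1, 2, ...).

    N is diagonal, so the exponentials are diagonal phases and the
    conjugation can be done entry by entry.
    """
    dim = len(op)
    return [
        [cmath.exp(-1j * theta * m) * op[m][k] * cmath.exp(1j * theta * k) for k in range(dim)]
        for m in range(dim)
    ]


a = lowering_operator(5)
theta = 0.3
rotated = conjugate_by_charge(a, theta)
# Compare one nonzero entry with the expected overall phase.
print(rotated[0][1], cmath.exp(1j * theta) * a[0][1])
```

In this example the truncation is harmless because $N$ is diagonal; for charges that mix states near the cutoff, a finite matrix model would only reproduce the relation approximately.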
7.4
Lorentz invariance of L implies Lorentz invariance of S-matrix
Question 112. Describe the logic of Weinberg’s argument that the Lorentz invariance of L
implies the Lorentz invariance of the S-matrix.
Answer 112. The first question in section 3.2 of these notes says that the S-matrix is Lorentz-invariant if
$$S_{\beta\alpha} = \langle\Phi_\beta|S\Phi_\alpha\rangle = \langle U_0(\Lambda, a)\Phi_\beta|S\, U_0(\Lambda, a)\Phi_\alpha\rangle \implies [U_0(\Lambda, a), S] = 0,$$
where $U_0$ implements the Lorentz transforms on the wavefunctions. We already know relativistic
quantum mechanics, so $U_0$ is known (the generators are $J_0^{\mu\nu}$). What is unknown is $S = U(-\infty, \infty)$,
which itself is made of generators $J^{\mu\nu}$. Weinberg shows in (3.3.21) that this is satisfied if
$$[\vec K_0, V] = -[\vec W, H],$$
and matrix elements of $\vec W$ are smooth functions of the energies. Here, $\vec K_0$ generates boosts in the
free theory, $\vec K$ generates boosts in the interacting theory, and $\vec W = \vec K - \vec K_0$. The derivation is
really complicated.
However, we now know that, for a particular theory L , the representation of the Lorentz group
which acts on the canonical variables is generated by the conserved charges J µν of Noether’s
theorem, as in the previous section. If we know that L is invariant, then the explicit forms of
the generators can be calculated, as shown in the previous section. This is the stress-energy
tensor. These generators are exactly the ones we are looking for, i.e. the ones we use to build
S = U (−∞, ∞).
Then we just have to show that the regular commutations of the Lorentz generators are satisfied,
along with the two crucial conditions above. It turns out that these are satisfied for nearly any
L which is Lorentz-invariant.
7.5
Examples of canonical quantization and transition to interaction
picture
Performing canonical quantization of a theory is kind of like solving a crossword puzzle or doing
sudoku, in the sense that the problem is essentially the same every time but you can always have
fun doing it.
Weinberg does three examples. The first is the scalar field with derivative coupling to a current.
That is pretty easy and his derivation is clear enough. The second is the vector spin one field,
which involves introducing an ad hoc condition to eliminate the scalar component. Let us work
through the third example: the Dirac field.
Keep in mind that the original Lagrangians are for interacting theories in the Heisenberg picture
where, for example, Ψ evolves with the interacting Hamiltonian. We want to write a decomposition H = H0 + V in the interaction picture where the fields evolve with the free Hamiltonian, H0 . Finally, we express the fields in creation and annihilation operators; we can find the
(anti)commutations of the creation and annihilation operators by enforcing the (anti)commutations
of the fields themselves. For this last step, I realized in my final project for Statistical Physics
of Fields that you always need a classical equation of motion to express fields in $a^\dagger$ and $a$. This is
also true here; the classical equation of motion comes from $H_0$ and Hamilton’s equation.
We will also get to practice this technique in the problems.
Question 113. Carry out the canonical quantization, and transition to interaction picture, of the
Lagrangian density
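$$\mathcal L = -\bar\Psi(\not\partial + m)\Psi - U(\bar\Psi, \Psi).$$
Answer 113. First, note that $\bar\Psi$ has no time-derivative acting on it, so it has no conjugate variable,
as we mentioned before. The canonical conjugate of $\Psi$ is
$$\Pi = \frac{\partial\mathcal L}{\partial\dot\Psi} = -\bar\Psi\gamma^0.$$
The Hamiltonian is
$$H = \int d^3\vec x\,(\Pi\dot\Psi - \mathcal L).$$
In Weinberg’s conventions $(\gamma^0)^2 = -1$, so $\bar\Psi = \Pi\gamma^0$ and we can rewrite the free part of the Lagrangian as
$\mathcal L_0 = -\Pi\gamma^0(\not\partial + m)\Psi = \Pi\dot\Psi - \Pi\gamma^0(\vec\gamma\cdot\vec\nabla + m)\Psi$,
so
$$H = H_0 + V, \text{ where } H_0 = \int d^3\vec x\,\Pi\gamma^0(\vec\gamma\cdot\vec\nabla + m)\Psi \quad\text{and}\quad V = \int d^3\vec x\, U(\bar\Psi, \Psi).$$
Passing to the interaction picture, in this case, means replacing all uppercase with lowercase. The
classical equation of motion for $\psi$ is given by Hamilton’s equation,
$$\dot\psi = \frac{\delta H_0}{\delta\pi} = \gamma^0(\vec\gamma\cdot\vec\nabla + m)\psi \implies (\not\partial + m)\psi = 0.$$
To express the field in raising and lowering operators, we need a classical equation of motion,
which we now have. We write $\psi(x)$ as the expansion
$$\psi(x) = \int_{\vec p\sigma} u(\vec p\sigma)\, e^{ipx}\, a_{\vec p\sigma} + v(\vec p\sigma)\, e^{-ipx}\, b^\dagger_{\vec p\sigma},$$
where $(i\not p + m)u_{\vec p\sigma} = (-i\not p + m)v_{\vec p\sigma} = 0$.
To enforce the canonical anticommutators $\{\psi_\alpha(\vec x, t), \pi_\beta(\vec y, t)\}(\gamma^0)_{\beta\gamma} = i(\gamma^0)_{\alpha\gamma}\,\delta(\vec x - \vec y)$ (see section
7.1), it turns out that we must have $\{a_{\vec p\sigma}, a^\dagger_{\vec p'\sigma'}\} = \{b_{\vec p\sigma}, b^\dagger_{\vec p'\sigma'}\} = \delta(\vec p - \vec p')\,\delta_{\sigma'\sigma}$. This completes the
canonical quantization; the free Hamiltonian can be expressed
$$H_0 = \int_{\vec p\sigma} p^0\left(a^\dagger_{\vec p\sigma} a_{\vec p\sigma} + b^\dagger_{\vec p\sigma} b_{\vec p\sigma}\right),$$
up to an arbitrary constant.
This was pretty systematic, but why was it that $[\psi(\vec x, t), \pi(\vec y, t)] \neq i\delta(\vec x - \vec y)$? This is what I said
in the opening paragraph of my notes on section 7.2. Namely, there is a difference between having
a conjugate friend and having a conjugate friend who is also canonical.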
7.6
Problems
These problems are a bit hard.
Question 114. Consider the theory of a set of real scalar fields Ψn with Lagrangian density
$$\mathcal L = -\frac12\sum_{mn}\partial_\mu\Psi^n\,\partial^\mu\Psi^m\, f_{nm}(\Psi),$$
where $f_{nm}$ is some non-singular matrix function of the fields; this is called the nonlinear sigma-model. Carry out the canonical quantization of this theory and derive the interaction $V[\phi, \dot\phi]$ in
the interaction picture.
Answer 114. We find the conjugate momenta are
$$\Pi_n = \frac{\partial\mathcal L}{\partial\dot\Psi^n} = -\sum_m f_{nm}(\Psi)\,\partial_0\Psi^m.$$
This is an inspired guess, but it’s easy to check that
$$\sum_{nm}(f(\Psi)^{-1})_{nm}\,\Pi_n\Pi_m = \sum_{ab} f(\Psi)_{ab}\,\partial_0\Psi^a\,\partial_0\Psi^b.$$
Therefore, our Lagrangian can be written as
$$\mathcal L = -\frac12\sum_{nm}\left[(f(\Psi)^{-1})_{nm}\,\Pi_n\Pi_m + f_{nm}(\Psi)\,(\nabla\Psi^n)\cdot(\nabla\Psi^m)\right].$$
Everything in this is in terms of the canonical variables (i.e. no $C^r$ variables) so passage to the
interaction picture is accomplished by taking
$$\Psi \to \psi \quad\text{and}\quad \Pi \to \pi.$$
Also, I think the interaction is just what is left over from the kinetic term. So we would write
(interaction picture):
$$\mathcal L = \mathcal L_0 + U, \text{ where}$$
$$\mathcal L_0 = -\frac12\sum_{nm}\left[(f(\psi)^{-1})_{nm}\,\pi_n\pi_m + (\nabla\psi^n)\cdot(\nabla\psi^m)\right] \quad\text{and}\quad U = -\frac12\sum_{nm}(f_{nm}(\psi) - 1)\,(\nabla\psi^n)\cdot(\nabla\psi^m).$$
Question 115. Consider the theory of a complex scalar field $\Phi$ and a real vector field $V^\mu$ with corresponding tensor $F_{\mu\nu} = \partial_\mu V_\nu - \partial_\nu V_\mu$ and covariant derivative $D_\mu = \partial_\mu - igV_\mu$. Carry out the canonical quantization of
$$\mathcal{L} = -(D_\mu\Phi)^\dagger D^\mu\Phi - \frac14F_{\mu\nu}F^{\mu\nu} - \frac{m^2}{2}V_\mu V^\mu.$$
Derive the interaction in the interaction picture.
Answer 115. Just as in the photonic case, the antisymmetry of Fµν means that V 0 is not a
canonical variable since its time-derivative does not appear in the Lagrangian. Therefore, the
canonical variables are V i and Φ and Φ† .
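Let's find the conjugate momenta; I will call them $W^i$ and $\Pi$ and $\Pi^\dagger$. Expanding the Lagrangian density with $D_\mu = \partial_\mu - igV_\mu$ gives
$$\mathcal{L} = -(\partial_\mu\Phi^\dagger)(\partial^\mu\Phi) - igV_\mu(\Phi^\dagger\partial^\mu\Phi - \Phi\partial^\mu\Phi^\dagger) - g^2V_\mu V^\mu\Phi^\dagger\Phi - \frac12(\partial_\mu V_\nu\partial^\mu V^\nu - \partial_\nu V_\mu\partial^\mu V^\nu) - \frac{m^2}{2}V_\mu V^\mu.$$
I find that
$$\Pi = \partial_0\Phi^\dagger + igV_0\Phi^\dagger = (D_0\Phi)^\dagger \quad\text{and}\quad W^i = \partial^iV^0 - \partial^0V^i \implies V^0 = \frac{1}{m^2}\vec\nabla\cdot\vec W.$$
The last comes from the Euler-Lagrange equation on $V^0$, up to a term proportional to the scalar charge density, which I drop here.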
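Now, let's write $H = \int d^3\vec x\,(\Pi\dot\Phi + \Pi^\dagger\dot\Phi^\dagger + \vec W\cdot\dot{\vec V} - \mathcal{L})$. First, we write
$$\dot{\vec V} = -\vec\nabla V^0 + \vec W = \vec W - \frac{1}{m^2}\vec\nabla(\vec\nabla\cdot\vec W).$$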
8
Electrodynamics
Quantum electrodynamics - particularly the theory of the photon - is really the one part of QFT
that never made any sense. So, we will study and try to make sense of it now.
Weinberg also does some tree-level scattering calculations. These are the same as in any textbook
so I will not write notes on them.
8.1
Gauge invariance
Question 116. Why do we need gauge invariance?
Answer 116. In fact, we wouldn’t need gauge invariance in electrodynamics if nature operated
according to different laws than it does. But it seems that electrodynamics is mediated by a massless spin-1 particle which couples to a conserved current, $J^\mu = (\rho,\vec J)$. (If the particle didn't need to couple to a conserved current, we could just use $\mathcal{L}\supset F_{\mu\nu}F^{\mu\nu}$ and get out of jail free.)
Given that we need to couple a massless spin-1 particle to a conserved current, the easiest way to do so is to embed it in a vector field $A_\mu$, since $A_\mu J^\mu$ is a Lorentz scalar. However, a massless spin-1 particle cannot be embedded in a vector field without also requiring gauge invariance! The reason for this is that the photon little group, $ISO(2)$, restricts which irreducible representations of the Lorentz group $SO^+(1,3)$ a massless particle can live in: we saw in section 5.8 of these notes that the helicity of a massless particle in the $(A,B)$ irrep must be $\sigma = B - A$. The vector representation $(\frac12,\frac12)$ is an irrep with $B - A = 0$, so a helicity $\pm1$ photon cannot be embedded in a true four-vector field. Indeed, Weinberg (5.9.31) shows that under a general Lorentz transform,
$$U(\Lambda)a_\mu(x)U^{-1}(\Lambda) = \Lambda^\nu{}_\mu a_\nu(\Lambda x) + \partial_\mu\Omega(x,\Lambda),$$
where $\Omega(x,\Lambda)$ is some linear combination of creation and annihilation operators, so $a_\mu$ transforms as a four-vector only up to a gauge transformation.
This is not true for massive particles. The reason is that the generators of the photon little group
ISO(2) mix rotations and boosts, but the generators of the massive little group SO(3) consist
purely of rotations.
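To make the helicity rule $\sigma = B - A$ concrete, here is a small sketch that tabulates it for the lowest few $(A,B)$ irreps. The range of spins scanned and the function names are illustrative choices, not anything from Weinberg.

```python
from fractions import Fraction

half = Fraction(1, 2)
spins = [k * half for k in range(3)]   # scan A, B in {0, 1/2, 1} (an arbitrary cutoff)

def helicity(a, b):
    """Helicity of a massless particle carried by an (A, B) field: sigma = B - A."""
    return b - a

table = {(a, b): helicity(a, b) for a in spins for b in spins}

# Irreps in the scanned range that can carry helicity +1 or -1.
photon_irreps = sorted(ab for ab, sigma in table.items() if abs(sigma) == 1)

print(table[(half, half)])   # helicity allowed in the four-vector irrep
print(photon_irreps)
```

Within this range, only $(0,1)$ and $(1,0)$ carry helicity $\pm1$. Those are the self-dual and anti-self-dual parts of an antisymmetric tensor such as $F_{\mu\nu}$, which is consistent with the field strength being a genuine tensor while $a_\mu$ is not.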
Question 117. Describe how gauge invariance allows us to couple the photon to an external
current in a Lorentz-invariant way.
Answer 117. To ensure that the extra term in
$$U(\Lambda)a_\mu(x)U^{-1}(\Lambda) = \Lambda^\nu{}_\mu a_\nu(\Lambda x) + \partial_\mu\Omega(x,\Lambda)$$
has no effect, we require that the action be invariant under the gauge transformation
$$A_\mu(x)\to A_\mu(x) + \partial_\mu\epsilon(x)$$
for any $\epsilon(x)$. (Here, uppercase means interacting and lowercase means free.)
Now we will couple this to a conserved current by constructing a Lagrangian that forces the current
to be conserved, so to speak. The change in the action under a gauge transform is
$$0 = \delta I = \int d^4x\,\frac{\delta I}{\delta A_\mu(x)}\partial_\mu\epsilon(x) \implies 0 = \partial_\mu\left(\frac{\delta I}{\delta A_\mu(x)}\right).$$
Because Noether's theorem on continuous global symmetries implies conserved currents, we introduce a global $U(1)$ symmetry into the action. Let $\Psi_l(x)$ be all fields other than the photon field, and consider the $U(1)$ symmetry
$$\delta\Psi_l(x) = i\epsilon q_l\Psi_l(x),$$
which is generated by the charge $Q = \int d^3\vec x\,J^0$, where $\partial_\mu J^\mu = 0$ by Noether's theorem. If we scale the $U(1)$ charges such that
$$J^\mu = \frac{\delta I}{\delta A_\mu(x)},$$
then we solve two problems (introduction of a conserved current and gauge-invariance) at once. Everybody is happy!
Question 118. In the above question, we started with a global U (1) symmetry. However, QED
is based on a local U (1) symmetry. Where does this come from?
Answer 118. If we substitute $J^\mu = \frac{\delta I}{\delta A_\mu(x)}$ into $0 = \delta I = \int d^4x\,\frac{\delta I}{\delta A_\mu(x)}\partial_\mu\epsilon(x)$, we find that the symmetry has become local, since now $\epsilon(x)$ doesn't have to be a constant and the action is still invariant.
This is because of the nontrivial identification $J^\mu = \frac{\delta I}{\delta A_\mu(x)}$. (If the $U(1)$ symmetry was only global, we wouldn't call QED a gauge theory.)
Question 119. What is the logically opposite way of arriving at the Aµ J µ coupling?
Answer 119. You can find this on Weinberg 342-43 and in Chapter 1 of Nakahara Geometry,
Topology, and Physics. Essentially we start with a global U (1) symmetry and promote it to a
local symmetry. This necessitates introducing a covariant derivative
$$D_\mu^l = \partial_\mu - iq_lA_\mu$$
to act on each field $\Psi_l(x)$. The hypothetical mass term $A_\mu A^\mu$, of course, is ruled out as not gauge-invariant, so that is why (in this logic) the photon field must be massless.
Question 120. How can I think of $J^\mu$? For example, where is $J^\mu$ in the usual Lagrangian for QED,
$$\mathcal{L} = \bar\psi(i\slashed{D} - m)\psi?$$
Answer 120. Recall that $J^\mu$ is a current which describes the charge of the fields which transform under local $U(1)$ gauge symmetry. In this Lagrangian, $J^\mu$ is just what it looks like: namely, $\frac{\partial\mathcal{L}}{\partial A_\mu} = \bar\psi\gamma^\mu e\psi$ (up to some constants). This reflects the fact that $J^\mu$ comes from Noether's theorem applied on the charged fields; in this case, the charged field is $\psi$.
Question 121. How does QED return Maxwell’s equations?
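Answer 121. Everyone knows this one. We take the Lagrangian
$$\mathcal{L} = -\frac14F_{\mu\nu}F^{\mu\nu} + A_\mu J^\mu + \cdots.$$
The classical (i.e. on-shell) field equation for the photon field is
$$0 = \partial_\mu F^{\mu\nu} + J^\nu.$$
This gives the inhomogeneous Maxwell equations $\vec\nabla\cdot\vec E = \rho$ and $\vec\nabla\times\vec B - \partial_t\vec E = \vec J$ (in rationalized units; in Gaussian units $\rho$ and $\vec J$ pick up factors of $4\pi$). The other two Maxwell equations have nothing to do with charge, so they have nothing to do with $J^\mu$. Instead, they arise from the definition of $F_{\mu\nu}$,
$$0 = \partial_\mu F_{\nu\lambda} + \partial_\lambda F_{\mu\nu} + \partial_\nu F_{\lambda\mu}.$$
This is called the Bianchi identity.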
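As a short worked check (a sketch, assuming only $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ and that partial derivatives commute), the Bianchi identity follows by direct substitution, with the six terms cancelling in pairs:

```latex
\begin{align}
\partial_\mu F_{\nu\lambda} + \partial_\lambda F_{\mu\nu} + \partial_\nu F_{\lambda\mu}
&= \partial_\mu(\partial_\nu A_\lambda - \partial_\lambda A_\nu)
 + \partial_\lambda(\partial_\mu A_\nu - \partial_\nu A_\mu)
 + \partial_\nu(\partial_\lambda A_\mu - \partial_\mu A_\lambda) \\
&= (\partial_\mu\partial_\nu - \partial_\nu\partial_\mu)A_\lambda
 + (\partial_\lambda\partial_\mu - \partial_\mu\partial_\lambda)A_\nu
 + (\partial_\nu\partial_\lambda - \partial_\lambda\partial_\nu)A_\mu = 0.
\end{align}
```

No field equation is used, which is why the homogeneous Maxwell equations hold identically once $F$ is written in terms of $A$.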
8.2
Difficulties with quantizing the photon field
There are multiple difficulties to surmount before we can quantize the photon field. They are
related to gauge-invariance, if you want to think about it that way.
Question 122. Describe why A0 (x), the time-component of the photon field, has no conjugate
variable. Explain how this is related to gauge invariance.
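Answer 122. For $\mathcal{L}\sim F_{\mu\nu}F^{\mu\nu}$, we find the canonical conjugate
$$\Pi^\mu = \frac{\partial\mathcal{L}}{\partial(\partial_0A_\mu)}$$
gives $\Pi^0\sim F^{00} = 0$, since $F^{\mu\nu}$ is antisymmetric. (In fact, we studied this in Chapter 7.) This is called a primary constraint because it follows directly from the structure of $\mathcal{L}$.
There is also a secondary constraint
$$\partial_i\Pi^i = -\partial_i\frac{\partial\mathcal{L}}{\partial F_{i0}} = -\frac{\partial\mathcal{L}}{\partial A_0} = -J^0.$$
This follows from the Euler-Lagrange equation for $A_0$: since $\partial_0A_0$ does not appear in $\mathcal{L}$, that equation involves only spatial derivatives of $\Pi^i$ and is a constraint at each fixed time rather than an equation of motion.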
This is related to gauge invariance because, given any solution $A_\mu(x)$ of the field equations, we can always find another solution $A_\mu(x) + \partial_\mu\epsilon(x)$ with the same value and time-derivative at $t = 0$ but which differs from $A_\mu(x)$ at later times. The number of degrees of freedom (df) involved here is one, because $\epsilon(x)$ comes packaged with 1 df. This also suggests the solution: given that $A_\mu$ has some freedom due to gauge invariance, we can remove the freedom (i.e. make $A_\mu(x)$ a deterministic object, which makes it possible to quantize) by choosing a gauge.
Question 123. Describe with examples why choosing a gauge allows us to describe a photon
4-component field with only 3 propagating degrees of freedom.
Answer 123. We found in section 7.3 of these notes that we can explicitly find the conserved
charge, J 0 , in terms of the canonical variables. Well, A0 is not a canonical variable, as shown in
the previous question. The point here is that J 0 is known.
Let us first choose the Coulomb gauge, $\vec\nabla\cdot\vec A = 0$. Then
$$\vec\nabla\cdot\vec A = 0 \quad\text{and}\quad -\partial_iF^{i0} = J^0 \implies -\nabla^2A^0 = J^0,$$
and because $J^0$ is known, so is $A^0(x)$.
Or we can choose Lorenz (Landau) gauge $\partial_\mu A^\mu = 0$. The solution for $A^0(x)$ would look different but the df are the same as Coulomb gauge, so there is also a deterministic solution for $A^0$ in terms of all the canonical variables in the theory (even those which are for other fields besides the photon field, since they enter in $A^0$ through $J^0$).
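For the Coulomb-gauge case, the statement that $A^0$ is known once $J^0$ is known can be made explicit. This is a sketch assuming the fields fall off at spatial infinity, so that the standard Green's function of the Laplacian applies:

```latex
\begin{align}
-\nabla^2\left(\frac{1}{4\pi|\vec x - \vec y|}\right) &= \delta^3(\vec x - \vec y), \\
-\nabla^2 A^0 = J^0 \;&\implies\; A^0(\vec x, t) = \int d^3\vec y\,\frac{J^0(\vec y, t)}{4\pi|\vec x - \vec y|}.
\end{align}
```

So $A^0$ at time $t$ is the instantaneous Coulomb potential of the charge density at the same time $t$; it carries no independent dynamics.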
Question 124. Describe the quantization of QED. Specifically, outline the important points in
Weinberg’s (quite difficult) derivation on pages 346-50.
Answer 124. Because of the constraints
$$\partial_iA^i = 0 \quad\text{and}\quad \partial_i\Pi^i + J^0 = 0,$$
the variables $A^i(x)$ are still not free of constraints and hence there is difficulty in quantizing them, even though we have already eliminated the $A^0(x)$ component by choosing the Coulomb gauge. It turns out that the equal-time commutators are modified to (8.3.5):
$$[A_i(\vec x),\Pi_j(\vec y)] = i\delta_{ij}\delta(\vec x-\vec y) + i\frac{\partial^2}{\partial x^j\partial x^i}\left(\frac{1}{4\pi|\vec x-\vec y|}\right).$$
The derivation of this is really hard. The end result is that
$$H = \int d^3\vec x\left[\frac12\vec\Pi_\perp^2 + \frac12(\vec\nabla\times\vec A)^2 - \vec J\cdot\vec A + \frac12J^0A^0\right] + H_M,$$
where $H_M$ is the rest of the Hamiltonian for the other fields (which may be coupled to the photon field through the $A^0$ term), and
$$\vec\Pi_\perp = \vec\Pi - \vec\nabla A^0$$
is chosen because it commutes with $H_M$, basically.
8.3
QED in the interaction picture; photon propagator
Let me simply summarize the passage to the interaction picture. Then we will study the photon
propagator in detail.
We start with the Hamiltonian from the previous question and split it into quadratic terms plus
the interaction,
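$$H = H_0 + V, \quad\text{where}$$
$$H_0 = \int d^3\vec x\left[\frac12\vec\Pi_\perp^2 + \frac12(\vec\nabla\times\vec A)^2\right] + H_{M,0} \quad\text{and}\quad V = \int d^3\vec x\left[-\vec J\cdot\vec A + \frac12J^0A^0\right] + V_M.$$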
In this way, we can use the quadratic free Hamiltonian to talk about a “photon propagator.” Then
we pass to the interaction picture by applying the similarity transformation described in section
7.5 of these notes. The regular Hamilton equations of motion for the conjugate pair $\vec a,\vec\pi$ give the wave equation
$$\Box\,\vec a = 0.$$
Now there is a subtlety. “Since A0 is not an independent Heisenberg-picture field variable, but
rather a functional of the matter fields and their canonical conjugates that vanishes in the limit
of zero charges, we do not introduce any corresponding operator a0 in the interaction picture, but
rather take a0 = 0.” We can use the wave equation to second-quantize the field a(x); it turns
out that the weird commutations of the previous question (in the Heisenberg picture) are satisfied
only if the usual commutations of creation and annihilation operators in the interaction picture
are satisfied.
Conclusion: We can eliminate the time-component of the photon field in the interaction picture
but not in the Heisenberg picture. We find that quantizing the field is easy after we have gone to
interaction picture (upon reflection, in section 7.5, we also quantized after we went to interaction
picture).
Question 125. Describe the calculation of the photon propagator.
Answer 125. First, recall from above that we could set $a^0 = 0$ in the interaction picture. This means that $e^0(\vec p\sigma) = 0$, so only the three spatial components $e^i(\vec p\sigma)$ of the polarization vectors are nonzero. We can normalize them such that
$$P_{ij}(\vec p) = \sum_\sigma e_i(\vec p\sigma)e_j(\vec p\sigma)^* = \delta_{ij} - p_ip_j/|\vec p|^2,$$
with $P_{\mu\nu}$ vanishing whenever $\mu$ or $\nu$ is 0.
As in section 6.2 of these notes, we define the propagator
$$-i\Delta_{\mu\nu}(x-y) = \langle\hat T[a_\mu(x),a_\nu(y)]\rangle$$
where the expectation is a vev (obviously, because we are working with free fields and there is no background). We find that
$$\Delta_{\mu\nu}(x-y) = \frac{1}{(2\pi)^4}\int d^4q\,e^{iq(x-y)}\frac{P_{\mu\nu}(\vec q)}{q^2-i\epsilon}.$$
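The polarization sum can be checked numerically. The sketch below assumes a real linear-polarization basis rather than the helicity basis; the two are related by a unitary rotation within the transverse plane, so the sum over $\sigma$ is the same. The momentum value and helper names are illustrative.

```python
import math

def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def transverse_basis(p):
    """Two real orthonormal vectors perpendicular to p (linear polarizations)."""
    p_hat = normalize(p)
    # Any helper vector not parallel to p works; use the axis least aligned with p.
    k = min(range(3), key=lambda i: abs(p_hat[i]))
    helper = [1.0 if i == k else 0.0 for i in range(3)]
    e1 = normalize(cross(p_hat, helper))
    e2 = cross(p_hat, e1)   # already unit length since p_hat and e1 are orthonormal
    return e1, e2

p = [0.3, -1.1, 2.0]        # illustrative photon momentum
e1, e2 = transverse_basis(p)
p2 = sum(c * c for c in p)

pol_sum = [[e1[i] * e1[j] + e2[i] * e2[j] for j in range(3)] for i in range(3)]
projector = [[(1.0 if i == j else 0.0) - p[i] * p[j] / p2 for j in range(3)]
             for i in range(3)]
max_diff = max(abs(pol_sum[i][j] - projector[i][j])
               for i in range(3) for j in range(3))
print(max_diff)
```

The printed difference should be at the level of rounding error, reflecting the completeness relation $\hat p_i\hat p_j + \sum_\sigma e_ie_j^* = \delta_{ij}$.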
Question 126. Describe the subtlety in the photon propagator. What did Feynman get wrong?
Answer 126. Feynman’s pioneering paper which established the fundamentals of QED is PR 76
769 (1949). Unfortunately, he got something wrong about the gauge invariance of the photon
propagator, which was noticed by Bialynicki-Birula in PR 155 1414 (1967).
This is described (well, a related phenomenon is described) on pages 354-355 of Weinberg. The
punchline is that the photon propagator is not gauge-invariant on mass shell, but that this is
cancelled by another non-gauge-invariant effect in renormalization. The easiest way to prove this
is through the path-integral formalism.
The result ((8.5.9) in Weinberg) is that we can effectively take the photon propagator to be the covariant quantity
$$\frac{-i}{(2\pi)^4}\frac{\eta_{\mu\nu}}{q^2-i\epsilon}$$
in momentum-space. This is nice because it guarantees our matrix elements will be Lorentz scalars (which they might not be if we used a different convention for the photon propagator).
Question 127. Describe numerically the effect of adding more loops to a Feynman diagram. In
other words, how does the numerical value of the matrix element for a given process scale with
the number of loops?
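Answer 127. This is from page 358. A connected diagram with $V$ vertices, $I$ internal lines, $E$ external lines, and $L$ loops satisfies the graph theory relations
$$L = I - V + 1 \quad\text{and}\quad 2I + E = 3V.$$
There is a factor $e(2\pi)^4$ from each vertex, $(2\pi)^{-4}$ from each internal line, and a 4D momentum-space integral from each loop, which turns out to contribute $\pi^2$. (This is because, after rotating to Euclidean momentum, the 4D volume element in terms of the radius $\kappa$ is $\pi^2\kappa^2\,d\kappa^2$.) Eliminating $I$ and $V$ in favor of $L$ and $E$ gives $V = 2L + E - 2$ and $V - I = 1 - L$. Therefore, the diagram scales like
$$(2\pi)^4e^{E-2}\left(\frac{e^2}{16\pi^2}\right)^L = (2\pi)^4e^{E-2}\left(\frac{\alpha}{4\pi}\right)^L.$$
The important factor is $\alpha/4\pi = 5.81\times10^{-4}$, which is dimensionless. Because this is small, many-loop diagrams have small contributions.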
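The counting can be verified with a few lines of arithmetic. This is a sketch that multiplies out the per-vertex, per-line and per-loop factors quoted above and compares the result with the closed form; the value of $\alpha$ is the usual approximate one and the choice $E = 4$ is just an example.

```python
import math

ALPHA = 1 / 137.036                    # fine-structure constant (approximate)
e = math.sqrt(4 * math.pi * ALPHA)     # coupling, using alpha = e^2 / (4 pi)

def counts(loops, external):
    """Solve L = I - V + 1 and 2I + E = 3V for the vertex and internal-line counts."""
    vertices = 2 * loops + external - 2
    internal = (3 * vertices - external) // 2
    return vertices, internal

def raw_factor(loops, external):
    """e(2 pi)^4 per vertex, (2 pi)^-4 per internal line, pi^2 per loop."""
    v, i = counts(loops, external)
    return ((e * (2 * math.pi) ** 4) ** v
            * (2 * math.pi) ** (-4 * i)
            * math.pi ** (2 * loops))

def closed_form(loops, external):
    return (2 * math.pi) ** 4 * e ** (external - 2) * (ALPHA / (4 * math.pi)) ** loops

for loops in range(4):
    print(loops, raw_factor(loops, 4), closed_form(loops, 4))

loop_suppression = ALPHA / (4 * math.pi)
print(loop_suppression)
```

Each extra loop multiplies the estimate by `loop_suppression`, roughly $5.8\times10^{-4}$, which is the quantitative sense in which higher-loop diagrams are small.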
8.4
p-form gauge fields
The strength tensor $F_{\mu\nu}$ of QED can be generalized to arbitrary tensors. Let us consider an arbitrary p-form $t_{\mu_1\mu_2\cdots\mu_p}$ with exterior derivative $dt$. A p-form whose exterior derivative vanishes is called closed and a form which itself is an exterior derivative is called exact. Since $d^2 = 0$, any exact form is also closed. Poincare proved that in a simply connected region, any closed form is also exact. (For example, the homogeneous Maxwell equations $\vec\nabla\times\vec E = -\partial_t\vec B$, $\vec\nabla\cdot\vec B = 0$ tell us that $F_{\mu\nu}$ is closed. Since $\mathbb{R}^4$ is simply connected, it must also be exact, so there exists some $A_\mu$ such that $F = dA$.)
Question 128. How can we generalize the 1-form Aµ gauge theory to a gauge theory of general
p-forms, in D dimensions?
Answer 128. Consider the generalized gauge transform
$$\delta A = d\Omega \implies \delta A_{\mu_1\cdots\mu_p} = \partial_{[\mu_1}\Omega_{\mu_2\cdots\mu_p]},$$
where $\Omega$ is an arbitrary $(p-1)$-form. Construct the field strength tensor
$$F = dA$$
and also the Lagrangian density
$$\mathcal{L} = -\frac{1}{2(p+1)}F_{\mu_1\cdots\mu_{p+1}}F^{\mu_1\cdots\mu_{p+1}} + J^{\mu_1\cdots\mu_p}A_{\mu_1\cdots\mu_p}, \quad\text{where}\quad \partial_{\mu_1}J^{\mu_1\cdots\mu_p} = 0.$$
This conservation law guarantees that the action (the integral of the Lagrangian density) is invariant under a gauge transform, i.e. via integration by parts. Clearly we must have $p + 1\le D$ due to antisymmetry of $F$.
Question 129. Prove this duality theorem:
In D dimensions, the theory of a p-form gauge field A is equivalent to the theory of a
(D − p − 2)-form gauge field φ.
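Answer 129. Unsurprisingly, we will use the duality transforms
$$F^{\mu_1\cdots\mu_{p+1}} = \epsilon^{\mu_1\cdots\mu_D}\mathcal{F}_{\mu_{p+2}\cdots\mu_D} \quad\text{and}\quad J^{\mu_1\cdots\mu_p} = \epsilon^{\mu_1\cdots\mu_D}\mathcal{J}_{\mu_{p+1}\cdots\mu_D},$$
which define a $(D-p-1)$-form $\mathcal{F}$ and a $(D-p)$-form $\mathcal{J}$. The field equation and Noether conservations
$$\partial_\mu F^{\mu\mu_1\cdots\mu_p} = -J^{\mu_1\cdots\mu_p} \quad\text{and}\quad \partial_{\mu_1}J^{\mu_1\cdots\mu_p} = 0 \quad\text{become}$$
$$d\mathcal{F} = \mathcal{J} \quad\text{and}\quad d\mathcal{J} = 0.$$
This means that $\mathcal{J}$ is a closed form; therefore it can be written as $\mathcal{J} = d\mathcal{G} \implies d\mathcal{F} = d\mathcal{G}$. Thus, Poincare tells us that
$$\mathcal{F} = \mathcal{G} + d\phi$$
where $\phi$ is a $(D-p-2)$-form. We still have not used one piece of information: the identity $dF = 0 \implies \partial_{\mu_1}\mathcal{F}^{\mu_1\cdots\mu_{D-p-1}} = 0$ thus becomes
$$\partial_{\mu_1}(d\phi)^{\mu_1\cdots\mu_{D-p-1}} = -\partial_{\mu_1}\mathcal{G}^{\mu_1\cdots\mu_{D-p-1}}.$$
This is invariant under the new gauge transform $\phi\to\phi + d\omega$. In the case where $D = p + 2$, $\phi$ is a scalar, so $d\omega$ is replaced by a constant (and all the terms in the Lagrangian would be constructed out of derivatives only).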
8.5
Problems
Question 130. Calculate the differential and total cross-sections for the process
e+ + e− → µ+ + µ−
to lowest order in e. Assume that spins are not observed and use the simplest possible Lagrangian.
Answer 130. This is computed in my notes on Schwartz Quantum Field Theory. I feel that
Schwartz’ Feynman rule conventions are a little cleaner and more standard than Weinberg’s, so I
won’t bother to do it again.
The result for the differential scattering cross-section turned out to be
$$\left(\frac{d\sigma}{d\Omega}\right)_{CM} = \frac{\alpha^2|\vec p|}{16E^6|\vec k|}\left(E^4 + k^2p^2\cos^2\theta + E^2(m_e^2 + m_\mu^2)\right).$$
Question 131. Write a gauge-invariant Lagrangian for a charged massive vector field interacting
with the electromagnetic field.
Answer 131. The fact that the vector field is charged means that it also transforms (i.e. local
phase) under the gauge transform. A possible Lagrangian density is
$$\mathcal{L} = \frac12(D_\mu V^\nu)^\dagger(D^\mu V_\nu) + \frac12(D_\mu V^\mu)^\dagger(D^\nu V_\nu) - \frac12m^2(V^\mu)^\dagger V_\mu + \frac14F_{\mu\nu}F^{\mu\nu}.$$
9
Path Integrals
We saw in the previous chapter that the photon propagator has some gauge freedom which is
“cancelled out,” so to speak, by renormalization. Instead of doing this messy stuff where we
have to choose a covariant propagator, we would like a formalism in which Lorentz covariance
is manifest in all the Feynman rules the moment we derive them. This is the path integral
formalism.
Feynman rules can therefore be derived from either the canonical formalism (Schwinger’s formalism) or from the path integral (Feynman’s formalism). However, the path integral is vastly
superior for more complicated theories, such as non-Abelian gauge theories, in which the gauge
freedom is even harder to handle than in QED. It is also better for theories such as the nonlinear
σ-model in which the Feynman rules from the canonical formalism are simply wrong! Weinberg
notes a “conservation of trouble,” in the sense that the canonical formalism makes unitarity manifest but not Lorentz invariance, but the path integral makes Lorentz invariance manifest but
obscures unitarity.
9.1
Hamiltonian version of path-integral formula; S-matrix
The derivation of the Hamiltonian version of the path-integral formula follows from inserting lots of intermediate resolutions of the identity, exploiting the coherent states $|\lambda\rangle = e^{\lambda a^\dagger}|0\rangle$. See Coleman Many-Body Physics for a detailed and similar derivation in condensed matter theory. Let
$$F = O_A(P(t_A),Q(t_A)),\,O_B(P(t_B),Q(t_B)),\cdots.$$
The result is ($x = (t,\vec x)$):
$$\langle q't'|\hat T[F]|q_0t_0\rangle = \int\prod_{x,a}dq_a(x)\prod_{x,b}\frac{dp_b(x)}{2\pi}\,F\,\exp\left[i\int_{t_0}^{t'}dt\left(\sum_a\dot q_a(t)p_a(t) - H(q,p)\right)\right].$$
We used the convention that all qs are to the left of all ps. You can also choose another convention.
Question 132. Why is the above form of the path integral not suitable for calculating S-matrix
elements?
Answer 132. Easy: S-matrix elements are between particle states $|\alpha\rangle$ and $|\beta\rangle$, not between coherent states (which are superpositions of an infinite number of particle states). Inserting the completion
$$\langle\beta| = \int\prod_{\vec x,a}dq'_a(\vec x)\,\langle\beta|q',\infty\rangle\langle q',\infty|$$
at $t = \infty$ and a similar completion at $t = -\infty$ gives
$$\langle\beta|\hat T[F]|\alpha\rangle = \int\prod_{x,a}dq_a(x)\prod_{x,b}\frac{dp_b(x)}{2\pi}\,F\,\exp\left[i\int_{-\infty}^{\infty}d^4x\left(\sum_a\dot q_a(x)p_a(x) - H(q,p)\right)\right]\langle\beta|q',\infty\rangle\langle q,-\infty|\alpha\rangle,$$
where $q'$ and $q$ denote the values of $q_a(x)$ at $t = +\infty$ and $t = -\infty$.
The big quantity is no longer inside a matrix element because the integral over $q_a(x)$ is now unconstrained over all $x = (\vec x,t)$, including the endpoints $t = \pm\infty$. The only memory of the states $|\alpha\rangle$ and $|\beta\rangle$ is in the wavefunctions at the end.
In fact, it turns out that the last part, the wavefunctions
$$\langle\beta|q',\infty\rangle\langle q,-\infty|\alpha\rangle,$$
gives the $i\epsilon$ in every propagator in the Feynman rules. The derivation is on Weinberg pg. 386-8 but is quite difficult. The final result is
$$\langle\beta|\hat T[F]|\alpha\rangle = |N|^2\int\prod_{x,a}dq_a(x)\prod_{x,b}\frac{dp_b(x)}{2\pi}\,F\,\exp\left[i\int_{-\infty}^{\infty}d^4x\left(\sum_a\dot q_a(x)p_a(x) - H(q,p) + (i\epsilon\text{ terms})\right)\right].$$
The normalization factors (such as $\frac{1}{\det A}$ and such) are unimportant because they also contribute to the matrix element $\langle\beta|\alpha\rangle$. Any constant factors divide out (or you can say they are cancelled out in the denominator of the Gell-Mann - Low theorem by exponentiation of the disconnected diagrams).
9.2
Lagrangian version of path-integral formula
The integrand in the exponentiated part of the previous formula,
$$\sum_a\dot q_a(x)p_a(x) - H(q,p),$$
looks quite a bit like the Lagrangian. This is not exactly true because Hamilton’s equations (or
their equivalents, the Euler-Lagrange equations) no longer hold in the path-integral formalism
when things are manifestly allowed to go way off-shell. Therefore, there is no “minimization” to
perform and hence no true notion of a Legendre transform.
However, it turns out that we can, in fact, pass to the Lagrangian form if the Hamiltonian is
quadratic in the canonical momentum variables (see pg. 389-90). This is kind of what Schwartz
does when he derives the path integral; essentially he does something like
$$p\dot x - \frac{p^2}{2m} - V \to \frac m2\dot x^2 - V = L$$
for nonrelativistic QM and postulates that the same thing works in relativistic QFT.
Because this receives nontrivial modifications for certain kinds of important theories, such as the
nonlinear sigma model, we will study it in detail here.
Question 133. Outline how, if the Hamiltonian is quadratic in the canonical momentum variables
$$H[Q,P] = \frac12\sum_{nm}\int_{\vec x\vec y}A_{\vec xn,\vec ym}[Q]P_n(\vec x)P_m(\vec y) + \sum_n\int_{\vec x}B_{\vec xn}[Q]P_n(\vec x) + C[Q],$$
where $A$ is real, symmetric, positive and nonsingular, we can pass from the Hamiltonian path integral to the Lagrangian path integral.
Answer 133. If we substitute this into the Hamiltonian version of the path integral and integrate
out the momenta (which we can do because the integral is Gaussian), then we get two things:
• First, we get the factor $(\det(2\pi i\mathcal{A}[Q]))^{-1/2}$, where $\mathcal{A}[Q] = A[Q]\delta(t-t')$. This will give nontrivial Feynman rules if the matrix $A[Q]$ is a nontrivial function of $Q$.
• Second, we get the Lagrangian in the exponential. How? The Gaussian integral is proportional to the “exponential evaluated at the stationary point of its argument.” But the stationary point for the canonical momentum is merely when Hamilton's equation
$$\dot q_n(\vec x,t) = \frac{\delta H}{\delta p_n(\vec x,t)}$$
is satisfied! Therefore, we can interpret
$$\sum_a\dot q_a(x)p_a(x) - H(q,p)$$
as a true Legendre transform after we integrate out the canonical momenta, because doing so automatically enforces Hamilton's equation.
To find expectation values, we have to assume the O are independent of the canonical momenta.
This can be achieved even if O contains time-derivatives of the canonical fields by expressing them
as O[Φ(t + dt), Φ(t)] instead of O[Π(t)]. A little weird, admittedly.
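Both points can be seen in a one-variable toy version. The following is a sketch, assuming a single degree of freedom, a single time slice of width $dt$, and $H = \frac12Ap^2 + Bp + C$ with $A > 0$ (the symbols mirror the general formula in the question but the reduction to one variable is my simplification):

```latex
\begin{align}
\int_{-\infty}^{\infty}\frac{dp}{2\pi}\,
\exp\left[i\,dt\left(\dot q\,p - \tfrac12Ap^2 - Bp - C\right)\right]
&= \frac{1}{\sqrt{2\pi iA\,dt}}\,
\exp\left[i\,dt\left(\frac{(\dot q - B)^2}{2A} - C\right)\right], \\
\text{stationary point:}\quad \dot q - A\bar p - B = 0
&\implies \bar p = \frac{\dot q - B}{A}
\quad\left(\text{i.e. } \dot q = \partial H/\partial p\right), \\
\dot q\,\bar p - \tfrac12A\bar p^2 - B\bar p - C
&= \frac{(\dot q - B)^2}{2A} - C = L(q,\dot q).
\end{align}
```

The prefactor is the one-slice analogue of $(\det 2\pi i\mathcal{A})^{-1/2}$, and the exponent is the integrand evaluated at $\bar p$, which is exactly the Legendre transform.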
Question 134. The above only works when the Hamiltonian is quadratic in the canonical momenta. Why is this a reasonable assumption for many theories?
Answer 134. Lots of theories look roughly like
$$H = H_{kin} + V(\Phi),$$
where the kinetic term is obviously quadratic in the momenta and the interaction has no momentum dependence at all.
Question 135. Describe the Feynman rules which could arise from the $(\det(2\pi i\mathcal{A}[Q]))^{-1/2}$ factor after integrating out the momenta.
Answer 135. Let us take the nonlinear sigma model
$$\mathcal{L} = -\frac12\sum_{nm}\partial_\mu\Phi_n\partial^\mu\Phi_m(\delta_{nm} + U_{nm}(\Phi)) - V(\Phi).$$
Let $\Pi_n$ be the canonically conjugate momentum to $\Phi_n$. It's not hard to calculate
$$H = \int d^3\vec x\left(\frac12\Pi_n(1+U(\Phi))^{-1}_{nm}\Pi_m + \frac12\nabla\Phi_n\cdot\nabla\Phi_m(1+U(\Phi))_{nm} + V(\Phi)\right),$$
and obviously we find
$$\mathcal{A} = (1+U(\Phi))^{-1}_{nm}\delta^{(4)}(x-y).$$
If we demote the $\delta$-function to a discrete $\delta$-function in a lattice spacetime, where each point on the lattice has small 4-dimensional volume $\Omega$, i.e. $\delta^{(4)}(x-y)\to\Omega^{-1}\delta_{xy}$, then we can calculate
$$\det\mathcal{A}\propto\exp\left[-\frac1\Omega\int d^4x\,\mathrm{tr}\ln(1+U(\Phi(x)))\right].$$
The result is that there is a correction of
$$\Delta\mathcal{L} = -\frac i2\Omega^{-1}\mathrm{tr}\ln(1+U(\Phi(x)))$$
to the Lagrangian density.
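The intermediate steps can be spelled out as follows. This is a sketch, assuming the lattice form of $\mathcal{A}$ is block-diagonal in the site label $x$ and using $\ln\det = \mathrm{tr}\ln$ on each block; field-independent constants are dropped.

```latex
\begin{align}
\ln\det\mathcal{A}
&= \sum_x\mathrm{tr}\ln\left[\Omega^{-1}\big(1+U(\Phi(x))\big)^{-1}\right]
 = \text{const} - \sum_x\mathrm{tr}\ln\big(1+U(\Phi(x))\big), \\
\sum_x &\to \Omega^{-1}\int d^4x, \\
\big(\det 2\pi i\mathcal{A}\big)^{-1/2}
&\propto \exp\left[+\frac{1}{2\Omega}\int d^4x\,\mathrm{tr}\ln\big(1+U(\Phi(x))\big)\right]
 = \exp\left[i\int d^4x\,\Delta\mathcal{L}\right], \\
\Delta\mathcal{L} &= -\frac{i}{2}\,\Omega^{-1}\,\mathrm{tr}\ln\big(1+U(\Phi(x))\big).
\end{align}
```

The factor of $i$ in $\Delta\mathcal{L}$ simply compensates the $i$ in $e^{iI}$, so the determinant contributes a real exponent.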
I think this is so interesting. It seems that in this case, the path-integral correction to the Lagrangian density is scale-dependent, because the lattice spacing depends on our energy scale. The larger the energy, the smaller the lattice spacing and the more important this correction becomes.
Compare this to the typical case, when we have $H\supset\int d^3\vec x\,\frac12\Pi_n^2$ and therefore
$$\mathcal{A} = \delta^{(4)}(x-x').$$
Replacing $(1+U)^{-1}$ with 1 gives the correction
$$\Delta\mathcal{L} = -\frac i2\Omega^{-1}\mathrm{tr}\ln(1) = 0.$$
9.3
Path-integral derivation of Feynman rules
Let us use the path-integral formalism to calculate quantities which look like
$$M_{l_Al_B\cdots}(x_Ax_B\cdots) = \frac{\langle\Omega,\infty|\hat T[\Psi_{l_A}(x_A)\Psi_{l_B}(x_B)\cdots]|\Omega,-\infty\rangle}{\langle\Omega,\infty|\Omega,-\infty\rangle}.$$
The bottom is just a phase factor, because of slow adiabatic evolution, etc. You know the drill by now. These vacuum expectation values are the ones we are interested in because S-matrix elements can be extracted from them (by putting the external legs on shell, as in LSZ reduction).
Question 136. Describe how to derive the Feynman rules from the path integral, given that the Hamiltonian is quadratic in the canonical momenta.
Answer 136. Obviously we have
$$M_{l_Al_B\cdots}(x_Ax_B\cdots) = \frac{\int D[\psi]\,\psi_{l_A}(x_A)\psi_{l_B}(x_B)\cdots e^{iI[\psi]}}{\int D[\psi]\,e^{iI[\psi]}},$$
and we will split $I = \int d^4x\,\mathcal{L}$ into the terms
$$I = I_0 + I_1, \quad\text{where}$$
$$I_0 = \int d^4x\,\mathcal{L}_0(\psi,\partial_\mu\psi) + (i\epsilon\text{ terms}) \quad\text{and}\quad I_1 = \int d^4x\,\mathcal{L}_1(\psi,\partial_\mu\psi).$$
Now, what follows is very similar to the derivation of Feynman rules in Kardar’s Statistical Physics
of Fields because Kardar also works in the path-integral formalism. To make use of Wick’s theorem
(which will allow us to use simple propagators), we expand in the interaction but keep the large
part of the Lagrangian (or action) intact, because that is quadratic and hence gives a Gaussian
distribution:
$$e^{iI[\psi]} = e^{iI_0[\psi]}\sum_{N=0}^\infty\frac{i^N}{N!}(I_1[\psi])^N.$$
The question now is how to use Wick's theorem. We assume $\mathcal{L}_0$ is quadratic in the fields and can be written
$$I_0 = -\frac12\int d^4x\,d^4x'\sum_{ll'}D_{lx,l'x'}\psi_l(x)\psi_{l'}(x')$$
where the matrix $D$ can contain derivatives as well, of course. For example, the simple scalar Lagrangian $\mathcal{L}_0 = -\frac12\partial_\mu\phi\partial^\mu\phi - \frac12m^2\phi^2$ has
$$D_{xx'} = \partial_\mu\partial'^\mu\delta(x-x') + m^2\delta(x-x') - i\epsilon E(\vec x,\vec x')\delta(t-t') = \int\frac{d^4p}{(2\pi)^4}e^{ip(x-x')}(p^2 + m^2 - i\epsilon E(\vec p)),$$
if we assume translational invariance. The $i\epsilon$ term is a little complicated and I didn't really bother to understand how he got it. In fact, because $\epsilon$ is infinitesimally small, we can just replace $i\epsilon E$ with $i\epsilon$. But anyway, you get the point. The good thing about this method is that this makes the propagator extraordinarily easy to see. You can basically see the propagator from the previous equation already!
Wick's theorem gives us a product of paired expectation values, like $\langle\phi_{l_1}(x)\phi_{l_2}(y)\rangle$. We can calculate this using Gaussian integration and the result of course is just $-i\Delta_{l_1l_2}(x,y)$, where
$$\Delta_{l_1l_2}(x,y) = (D^{-1})_{l_1x,\,l_2y}.$$
Here, the $-i$ came from both the $e^{iI}$ and the way $D$ is defined with respect to $I$. Okay, from above we merely have
$$\Delta(x,y) = \int\frac{d^4p}{(2\pi)^4}e^{ip(x-y)}(p^2 + m^2 - i\epsilon E(\vec p))^{-1}.$$
The function of the $i\epsilon$ is to make the inverse well-defined for all real values of $p$.
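Another important example: let's study the photon propagator. We will see in a later section that we can introduce another term into the Lagrangian, such that
$$I_0 = \int d^4x\left(-\frac14f_{\mu\nu}f^{\mu\nu} - \frac\alpha2(\partial_\mu a^\mu)^2 + (i\epsilon\text{ terms})\right) = -\frac12\int d^4x\,d^4x'\,D_{\mu x,\nu x'}a^\mu(x)a^\nu(x').$$
Here, with $\partial'$ acting on $y$,
$$D_{\mu x,\nu y} = (\eta_{\mu\nu}\partial_\rho\partial'^\rho - (1-\alpha)\partial_\mu\partial'_\nu)\delta(x-y) + (i\epsilon\text{ terms}) = \int\frac{d^4q}{(2\pi)^4}(q^2\eta_{\mu\nu} - (1-\alpha)q_\mu q_\nu - i\epsilon\eta_{\mu\nu})e^{iq(x-y)}.$$
Now, all we have to do is invert a $4\times4$ matrix. The result is
$$\Delta_{\mu x,\nu y} = \int\frac{d^4q}{(2\pi)^4}\left(\frac{\eta_{\mu\nu}}{q^2-i\epsilon} + \frac{1-\alpha}{\alpha}\frac{q_\mu q_\nu}{(q^2-i\epsilon)^2}\right)e^{iq(x-y)}.$$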
This is fully covariant.
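The $4\times4$ inversion can be checked numerically in momentum space. The sketch below assumes the metric $\eta = \mathrm{diag}(-1,1,1,1)$, drops the $i\epsilon$ terms, and uses an arbitrary off-shell momentum and gauge parameter; it verifies that $D_{\mu\rho}(q)\Delta^{\rho\nu}(q) = \delta_\mu{}^\nu$.

```python
ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]   # diag(-,+,+,+); eta with upper indices has the same entries

def lower(q_up):
    return [sum(ETA[m][n] * q_up[n] for n in range(4)) for m in range(4)]

def kernel(q_up, alpha):
    """D_{mu nu}(q) = q^2 eta_{mu nu} - (1 - alpha) q_mu q_nu, lower indices, no i*epsilon."""
    q_dn = lower(q_up)
    q2 = sum(q_dn[m] * q_up[m] for m in range(4))
    return [[q2 * ETA[m][n] - (1 - alpha) * q_dn[m] * q_dn[n] for n in range(4)]
            for m in range(4)]

def propagator(q_up, alpha):
    """Delta^{mu nu}(q) = eta^{mu nu}/q^2 + (1 - alpha)/alpha * q^mu q^nu / (q^2)^2."""
    q_dn = lower(q_up)
    q2 = sum(q_dn[m] * q_up[m] for m in range(4))
    return [[ETA[m][n] / q2 + (1 - alpha) / alpha * q_up[m] * q_up[n] / q2 ** 2
             for n in range(4)] for m in range(4)]

q = [0.4, 1.3, -0.7, 2.1]   # illustrative off-shell momentum q^mu, so q^2 != 0
alpha = 0.37                # arbitrary nonzero gauge parameter

D = kernel(q, alpha)
Delta = propagator(q, alpha)
product = [[sum(D[m][r] * Delta[r][n] for r in range(4)) for n in range(4)]
           for m in range(4)]
max_err = max(abs(product[m][n] - (1.0 if m == n else 0.0))
              for m in range(4) for n in range(4))
print(max_err)
```

The check fails at $\alpha = 0$ (division by zero), which mirrors the fact that without the gauge-fixing term the kernel has the zero mode $q_\nu$ and cannot be inverted.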
Question 137. Why are the propagators derived from path integration always manifestly covariant?
Answer 137. This question is not fully-formed. In fact, the propagators derived from path
integration are manifestly covariant only if the action I0 that gave birth to them was manifestly
Lorentz-invariant. This can be guaranteed for most theories of interest by integrating in or out
certain fields, for example.
A better question to ask may be why the propagators derived from the canonical formalism are not
always manifestly covariant. This is because, as we saw in section 8.3 of these notes, we derived
the photon propagator after choosing a gauge (and quantizing the field in this gauge). When we made the convenient choice $a^0(x) = 0$ in the interaction picture, for example, we gave up manifest Lorentz covariance, because of course $a^\mu$ doesn't obey the regular transformation laws if $a^0$ is always zero.
When we derive things in the path integral, there is no explicit gauge choice to spoil Lorentz
covariance of the propagator. As shown above, we make the gauge choice after we derive a general
form of the propagator.
9.4
Path-integral formulation of QED
The point of this section is essentially to show how the term
$$\frac\alpha2(\partial_\mu a^\mu)^2$$
in the Lagrangian density (see the previous section, where we found the photon propagator) arises. The logic is kind of strange, but the math is not very hard. I will just summarize the main points.
First, we start with the (not manifestly Lorentz-invariant) QED Hamiltonian in Coulomb gauge $\vec\nabla\cdot\vec A = 0$ (see section 8.2 of these notes):
$$H = \int d^3\vec x\left[\frac12\vec\Pi_\perp^2 + \frac12(\vec\nabla\times\vec A)^2 - \vec J\cdot\vec A + \frac12J^0A^0\right] + H_M,$$
where we can ignore $A^0$ in the interaction picture.
1. First we integrate out the conjugate field $\Pi$, which is easy to do because the Hamiltonian is quadratic in $\Pi$. Roughly speaking, the result is
$$Z = \int D[a,\psi]\exp\left[i\int d^4x\left(\frac12(\dot{\vec A})^2 - \frac12(\vec\nabla\times\vec A)^2 + \vec J\cdot\vec A + \mathcal{L}_M\right) - i\int dt\,V_{Coulomb}\right]\left[\prod_x\delta(\vec\nabla\cdot\vec A(x))\right].$$
Here, the last factor enforces the Coulomb gauge.
2. Second, we make the argument of the exponential manifestly Lorentz invariant by integrating in the $A^0$ component of the field. Essentially how this works is that the stationary value of
$$\int d^4x\left(-A^0J^0 + \frac12(\nabla A^0(x))^2\right)$$
turns out to be exactly the Coulomb term $-\int dt\,V_{Coulomb}$, so it may be reasonable that the Coulomb action is the low-energy limit of the action of a dynamical field $A^0(x)$. (This is what Girma Hailu means when he says “integrating in.”) Once this is done, we find that the action is manifestly Lorentz and gauge-invariant,
$$I[A,\psi] = \int d^4x\left(-\frac14F_{\mu\nu}F^{\mu\nu} + A_\mu J^\mu + \mathcal{L}_M\right) + (i\epsilon\text{ terms}).$$
3. Third, we get rid of the final product of delta functions which enforces Coulomb gauge. Think about it: this product of delta functions is equivalent to a gauge choice, so getting rid of it is equivalent to re-introducing gauge freedom, which turns out to be equivalent to introducing the term $\frac\alpha2(\partial_\mu a^\mu)^2$ with arbitrary $\alpha$ into the Lagrangian density. This makes perfect sense.
The last part of this derivation is on pg. 416-417. It is kind of creative and complex.
9.5
Problems
Question 138. Consider a nonrelativistic particle of mass $m$ in a 1D potential $V(x) = \frac m2\omega^2x^2$. Use path-integral methods to find the probability that if the particle is at $x_1$ at time $t_1$, then it is between $x$ and $x + dx$ at time $t$.
Answer 138. This is a well-documented example of a pain in the ass (see, for example, notes for
Physics 251b). The main difference between this Lagrangian and the Lagrangians we encounter
in QFT is that here, the Lagrangian is not translationally-invariant. It can be written
$$L = \int dx\,\psi^\dagger(x,t)\left(-\frac m2\partial_t^2 - \frac{1}{2m}\partial_x^2 - \frac12m\omega^2x^2\right)\psi(x,t).$$
This is a little weird compared to the Lagrangians we're used to from relativistic QFT (imagine if we had an interaction which looked like $\mathcal{L}\supset x^3\phi(x)^4$). The propagator is
$$\langle x't'|xt\rangle = \int_x^{x'}D[x(t)]\,e^{i\int_t^{t'}dt\,L[x(t)]},$$
and we would like to know how to compute this. Since the Lagrangian is quadratic, the integral is
Gaussian, and up to a constant from the determinant we may set the argument of the exponential
equal to the classical value from the classical equation of motion. If the boundary conditions are
(x, t) → (x0 , t0 ), then the classical path is known,
$$x_{cl}(t'') = \frac{1}{\sin\omega(t'-t)}\left(x\sin(\omega(t'-t'')) + x'\sin(\omega(t''-t))\right).$$
Here, $t''$ is an intermediate time between $t$ and $t'$; the relative plus sign ensures $x_{cl}(t) = x$ and $x_{cl}(t') = x'$. Then you can obtain the classical action just by substitution into $L$ and integrating over $t''$:
$$S_{cl}(x't',xt) = \frac{m\omega}{2\sin(\omega(t'-t))}\left((x^2 + x'^2)\cos(\omega(t'-t)) - 2xx'\right).$$
This means that the propagator is
$$\langle x' t'|x t\rangle = N(t, t')\exp\left[\frac{im\omega}{2\sin(\omega(t' - t))}\left((x^2 + x'^2)\cos(\omega(t' - t)) - 2xx'\right)\right],$$
where $N(t, t')$ is a normalization constant. It is independent of $x, x'$ because the size of the fluctuations around the classical path is independent of the endpoints (see 251b notes for more detail). I won’t do the math in detail, but $N$ can be fixed by demanding that time evolution be unitary, so that position eigenstates stay delta-function normalized:
$$\int dx'\, \langle y t|x' t'\rangle\langle x' t'|x t\rangle = \delta(y - x).$$
The quadratic terms in $x'$ cancel between the two propagators, so the integration over $x'$ is a Fourier integral and this is doable. The result is
$$|N(t, t')|^2 = \frac{m\omega}{2\pi|\sin(\omega(t' - t))|},$$
so we know $N(t, t')$, at least up to a phase. Then the probability that the particle is between $x$ and $x + dx$ at time $t$ is merely
$$P((x_1, t_1) \to ([x, x + dx], t)) = |\langle x t|\psi(t)\rangle|^2 dx, \quad\text{where}\quad \langle x t|\psi(t)\rangle = \int dx_2\, \langle x_2 t|x_1 t_1\rangle\langle x t|x_2 t\rangle.$$
The integral over $x_2$ is trivial because $\langle x t|x_2 t\rangle = \delta(x - x_2)$, so
$$P = |N(t_1, t)|^2 dx = \frac{m\omega}{2\pi|\sin(\omega(t - t_1))|}\, dx.$$
This is independent of $x$: a particle that starts in a position eigenstate has completely uncertain momentum, so at any later time (away from $\omega(t - t_1) = n\pi$, where the particle refocuses) it is equally likely to be found anywhere.
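The classical path and action quoted above for the oscillator can be checked numerically. The following is a minimal sketch, not part of Weinberg or the 251b notes: it assumes the ordinary point-particle Lagrangian $L = \frac{m}{2}\dot x^2 - \frac{m}{2}\omega^2 x^2$, takes the path with the sign chosen so that $x_{cl}(t') = x'$, and compares a Simpson-rule integral of $L$ along that path with the closed form for $S_{cl}$. All function names and parameter values are made up for illustration.

```python
import math

def x_classical(tau, x, t, xp, tp, omega):
    """Classical oscillator path with x_cl(t) = x and x_cl(tp) = xp."""
    s = math.sin(omega * (tp - t))
    return (x * math.sin(omega * (tp - tau)) + xp * math.sin(omega * (tau - t))) / s

def v_classical(tau, x, t, xp, tp, omega):
    """Time derivative of x_classical with respect to tau."""
    s = math.sin(omega * (tp - t))
    return omega * (-x * math.cos(omega * (tp - tau)) + xp * math.cos(omega * (tau - t))) / s

def action_numeric(x, t, xp, tp, m, omega, steps=2000):
    """Simpson's rule for the time integral of L = m v^2/2 - m omega^2 x^2/2 along x_cl."""
    h = (tp - t) / steps  # steps must be even for Simpson's rule
    total = 0.0
    for i in range(steps + 1):
        tau = t + i * h
        xc = x_classical(tau, x, t, xp, tp, omega)
        vc = v_classical(tau, x, t, xp, tp, omega)
        lagrangian = 0.5 * m * vc ** 2 - 0.5 * m * omega ** 2 * xc ** 2
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * lagrangian
    return total * h / 3.0

def action_closed_form(x, t, xp, tp, m, omega):
    """S_cl = m omega / (2 sin(omega T)) * ((x^2 + x'^2) cos(omega T) - 2 x x')."""
    big_t = tp - t
    return m * omega / (2.0 * math.sin(omega * big_t)) * (
        (x ** 2 + xp ** 2) * math.cos(omega * big_t) - 2.0 * x * xp)
```

Calling `action_numeric(0.4, 0.0, -1.1, 2.0, 1.3, 0.7)` and `action_closed_form` with the same arguments should agree to many digits, as long as $\omega(t' - t)$ is not a multiple of $\pi$.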
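Question 139. The Lagrangian density of the free spin-$\frac{3}{2}$ Rarita-Schwinger field, $\psi^\mu$, is
$$\mathcal{L} = -\bar\psi^\mu(\not\partial + m)\psi_\mu - \frac{1}{3}\bar\psi^\mu(\gamma_\mu\partial_\nu + \gamma_\nu\partial_\mu)\psi^\nu + \frac{1}{3}\bar\psi^\mu\gamma_\mu(\not\partial - m)\gamma^\nu\psi_\nu.$$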
Use path-integral methods to find the propagator of this field.
Answer 139. Although not mentioned in these notes, Weinberg shows in (9.5.50) that the free-particle term for fermionic path integrals is actually bilinear in the fields and their conjugate momenta,
$$I_0 = -\int d^4x\, d^4x' \sum_{ll'} D_{lx,l'x'}\, p_l(x)\, q_{l'}(x')$$
(clearly this must be true: if it was quadratic in a Grassmann variable, the term would just vanish).
It seems that, instead of going to the trouble of finding the conjugate momenta, we can just assume that we integrate over $\mathcal{D}\bar\psi$ and $\mathcal{D}\psi$. This is maybe not so rigorous, but see https://arxiv.org/pdf/hep-th/0404131.pdf and https://aip.scitation.org/doi/abs/10.1063/1.5064003.
So, we just write
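$$\mathcal{L} = \bar\psi_\mu \Lambda^{\mu\nu} \psi_\nu$$
and try to invert $\Lambda$, which is a $4 \times 4$ matrix in the vector indices (each entry is itself a $4 \times 4$ matrix in the Dirac indices). I suppose you can do it with Mathematica. The answer apparently (copied from second link above) is
$$\Delta^{\mu\nu}(p) = \frac{P^{\mu\nu}(p)}{p^2 + m^2 - i\epsilon}, \quad\text{where}$$
$$P^{\mu\nu}(p) = (-i\not p + m)\left(g^{\mu\nu} - \frac{1}{3}\gamma^\mu\gamma^\nu + \frac{2}{3m^2}p^\mu p^\nu + \frac{i}{3m}(\gamma^\mu p^\nu - \gamma^\nu p^\mu)\right)\beta.$$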
In the massless case, https://en.wikipedia.org/wiki/RaritaSchwinger_equation says that
the Rarita-Schwinger Lagrangian has a gauge symmetry, which would mean some kind of gauge
symmetry in the propagator, kind of like the gauge symmetry in the photon propagator. That
would make this propagator blow up and we would have to integrate in another field, or something.
In fact, this is easy to see because there are lots of $1/m$ terms in the massive Rarita-Schwinger propagator, just like in the massive spin-1 boson propagator.
10 Non-perturbative methods
We now study higher-order, or loop, processes. In this chapter we will concern ourselves with
deriving general results which are valid to all orders in perturbation theory, or could be manifestly
nonperturbative (the difference is that “to all orders in perturbation theory” means something is
true for all diagrams; “manifestly nonperturbative” means something can be proved true without
drawing any diagrams). We will not focus on regularizing integrals, because we would like to
conceptually separate the physics of higher-order processes from the mathematics of making them
convergent.
We will use the following theorem extensively: The sum of all diagrams between initial state $|\alpha\rangle$ and final state $|\beta\rangle$ with extra vertices inserted, corresponding to operators $O_a(x)$, $O_b(y)$, etc. is given by
$$\langle\Psi_\beta|\hat T\{-iO_a(x), -iO_b(y), \cdots\}|\Psi_\alpha\rangle.$$
There are factors of $-i$ due to the $e^{iL} = e^{i(p\dot q - H)}$, for example. Often, we take the operators to be the fields themselves. This is the perturbative expansion of $e^{-iH_{int}}$.
10.1 Symmetries
Question 140. What is meant when we say a particular matrix element, or a particular Feynman
diagram, preserves a symmetry?
Answer 140. Let us think about global symmetries first, since that allows us to use Noether’s
theorem. If a theory is invariant under a global symmetry, Noether says there is a conserved
charge, call it q. The matrix element or diagram conserves the charge, or equivalently, preserves
the symmetry, if
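$$\langle\Psi_\beta|\hat T\{-iO_a(x), -iO_b(y), \cdots\}|\Psi_\alpha\rangle \text{ vanishes unless } q_\alpha - q_\beta = q_a + q_b + \cdots.$$
Actually, this depends on the sign conventions. The above identity holds for
$$Q|\Psi_\alpha\rangle = q_\alpha|\Psi_\alpha\rangle \quad\text{and}\quad [Q, O_a(x)] = -q_a O_a(x).$$
The proof is very easy: evaluate
$$\langle\Psi_\beta|Q\hat T\{-iO_a(x), -iO_b(y), \cdots\}|\Psi_\alpha\rangle - \langle\Psi_\beta|\hat T\{-iO_a(x), -iO_b(y), \cdots\}Q|\Psi_\alpha\rangle = \langle\Psi_\beta|[Q, \hat T\{-iO_a(x), -iO_b(y), \cdots\}]|\Psi_\alpha\rangle$$
in two ways. Acting with $Q$ on the states, the left side is $(q_\beta - q_\alpha)$ times the matrix element; using the commutators, the right side is $-(q_a + q_b + \cdots)$ times the matrix element. So either the charges balance or the matrix element vanishes.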
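The selection rule can be seen in a toy model. The sketch below is an illustration under assumptions of my own, not something from Weinberg: it uses a truncated harmonic-oscillator Fock space, takes the number operator as the charge $Q$, and uses the lowering operator as an operator $O$ with $[Q, O] = -O$ (so $q_O = 1$). There is no time-ordering in this toy; it only checks which matrix elements of $O$ and of the product $OO$ can be nonzero.

```python
import math

DIM = 5  # truncate the toy Fock space to charges 0..4

# Charge operator Q = diag(0, 1, ..., DIM-1); lowering operator with <n-1|O|n> = sqrt(n).
Q = [[float(i) if i == j else 0.0 for j in range(DIM)] for i in range(DIM)]
O = [[math.sqrt(j) if j == i + 1 else 0.0 for j in range(DIM)] for i in range(DIM)]

def matmul(a, b):
    """Plain matrix product of two DIM x DIM matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(DIM)) for j in range(DIM)]
            for i in range(DIM)]

def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(DIM)] for i in range(DIM)]

def allowed_transitions(op, tol=1e-12):
    """Pairs (q_alpha, q_beta) for which <beta|op|alpha> is nonzero."""
    return [(alpha, beta) for beta in range(DIM) for alpha in range(DIM)
            if abs(op[beta][alpha]) > tol]

comm = commutator(Q, O)   # equals -O, i.e. the operator carries q_O = 1
OO = matmul(O, O)         # two insertions, total charge q_O + q_O = 2
```

Every pair returned by `allowed_transitions(O)` has $q_\alpha - q_\beta = 1$, and every pair returned by `allowed_transitions(OO)` has $q_\alpha - q_\beta = 2$, which is the toy version of $q_\alpha - q_\beta = q_a + q_b + \cdots$.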
Now, let’s think about discrete symmetries, like PCT; we will meet Furry’s theorem later in this section. Essentially what this means is that the diagram (or sum of diagrams) cannot contribute unless the result is invariant under, say, C.
Finally, how about gauge symmetries? The matrix element (or sum of Feynman diagrams) is preserved under the gauge symmetry if it doesn’t change when we shift the photon field $A_\mu(x) \to A_\mu(x) + \partial_\mu\epsilon(x)$. From our study of QED, we know that this is equivalent to saying that $\partial_\mu J^\mu = 0$, where the photon field is coupled to this current, $A_\mu J^\mu$. The generalization of $\partial_\mu J^\mu = 0$ to general matrix elements is what could be called the generalized Ward identity,
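$$q_\mu M^{\mu\mu'\cdots}_{\beta\alpha}(q, q', \cdots) = q'_{\mu'} M^{\mu\mu'\cdots}_{\beta\alpha}(q, q', \cdots) = \cdots = 0,$$
where
$$M^{\mu\mu'\cdots}_{\beta\alpha}(q, q', \cdots) = \int_{x,x',\cdots} e^{-iqx} e^{-iq'x'} \cdots \langle\Psi_\beta|\hat T\{J^\mu(x), J^{\mu'}(x'), \cdots\}|\Psi_\alpha\rangle.$$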
Question 141. Describe how a matrix element is invariant under translational symmetry.
Answer 141. Translational symmetry is a global symmetry, so we use the same idea as in the
question above. The symmetry operator is $P_\mu$ and we have $[P_\mu, O(x)] = i\partial_\mu O(x)$, and we can choose $|\alpha\rangle$ and $|\beta\rangle$ to be eigenstates of the momentum. If the big matrix element is called
$$M = \langle\Psi_\beta|\hat T\{-iO_a(x_1), -iO_b(x_2), \cdots\}|\Psi_\alpha\rangle,$$
then we have
$$(p_{\beta\mu} - p_{\alpha\mu})M = i(\partial_{1\mu} + \partial_{2\mu} + \cdots)M.$$
The solution is
$$M = e^{i(p_\alpha - p_\beta)x} F,$$
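where $F$ depends only on the relative distances between the $x_i$ and not the center coordinate $x = \sum_i c_i x_i$, $\sum_i c_i = 1$ (any choice of $c$’s is fine). Taking $M$ to Fourier space, $\tilde M$, gives something like
$$\tilde M = \int_{x_1 x_2 \cdots} e^{-ik_1 x_1 - ik_2 x_2 - \cdots + i(p_\alpha - p_\beta)x} F,$$
where the integral over the center coordinate produces $\delta^4(p_\alpha - p_\beta - k_1 - k_2 - \cdots)$, which enforces conservation of momentum!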
This is interpreted as: the sum of Feynman graphs conserves four-momentum, order-by-order.
This is not very interesting because every vertex of a Feynman graph conserves four-momentum,
so obviously the entire graph conserves four-momentum. So in this case, we can give a stronger
condition: every Feynman graph conserves four-momentum, and not just their sum.
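The statement that a matrix element of the form (phase in the center coordinate) times (function of relative coordinates) has a momentum-conserving Fourier transform can be checked on a small periodic lattice. This is a toy sketch under assumptions of my own: one dimension, $N$ sites, two insertion points, the center coordinate chosen as $x_1$ (that is, $c_1 = 1$, $c_2 = 0$), and an arbitrary made-up function `F` of $x_1 - x_2$. The integer `P` plays the role of $p_\alpha - p_\beta$.

```python
import cmath

N = 6  # number of sites on the periodic toy lattice

def F(d):
    """Arbitrary (made-up) function of the relative coordinate d = x1 - x2 mod N."""
    d = d % N
    return 1.0 / (1.0 + d) + 0.3j * d ** 2

def M(x1, x2, P):
    """Toy matrix element: exp(i P x) with center coordinate x = x1, times F(x1 - x2)."""
    return cmath.exp(2j * cmath.pi * P * x1 / N) * F(x1 - x2)

def M_tilde(k1, k2, P):
    """Discrete Fourier transform of M in both insertion points."""
    total = 0j
    for x1 in range(N):
        for x2 in range(N):
            phase = cmath.exp(-2j * cmath.pi * (k1 * x1 + k2 * x2) / N)
            total += phase * M(x1, x2, P)
    return total

def support(P, tol=1e-9):
    """All (k1, k2) for which the transformed matrix element is nonzero."""
    return [(k1, k2) for k1 in range(N) for k2 in range(N)
            if abs(M_tilde(k1, k2, P)) > tol]
```

Every pair in `support(P)` satisfies $k_1 + k_2 = P$ modulo $N$, the lattice analogue of the momentum-conserving delta function.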
Question 142. Prove Furry’s theorem,
The sum of all Feynman graphs with an odd number of external photons (on- or off-shell) and no
other external lines vanishes identically.
Describe why this theorem is true only for matrix elements, or sums of Feynman diagrams, rather
than for individual Feynman diagrams.
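Answer 142. Furry’s theorem is a consequence of C-invariance of the matrix element. The C-operator flips the charge but not the momentum or spin. If $\xi$ is a phase, then
$$\mathsf{C} a_{\vec p\sigma e}\mathsf{C}^{-1} = \xi^* a_{\vec p\sigma,-e} \quad\text{and}\quad \mathsf{C} a_{\vec p\sigma,-e}\mathsf{C}^{-1} = \xi a_{\vec p\sigma e}.$$
The Lagrangian is made of Dirac spinors, not these annihilation operators. The action of $\mathsf{C}$ on the Dirac field is
$$\mathsf{C}\psi\mathsf{C}^{-1} = -\xi^*\beta C\psi^*.$$
Here, $\beta C$ is a $4 \times 4$ matrix. The important thing is that
$$\mathsf{C}(\bar\psi\gamma^\mu\psi)\mathsf{C}^{-1} = -\bar\psi\gamma^\mu\psi \implies \mathsf{C}A^\mu\mathsf{C}^{-1} = -A^\mu.$$
Otherwise the Lagrangian coupling between current and photon field would not be C-invariant.
Because $A^\mu$ is odd under charge conjugation, the vacuum expectation value of an odd number of photon fields, i.e. the sum of Feynman graphs with an odd number of external photon lines and no other external lines, is also odd. Since the vacuum is C-invariant, this quantity equals minus itself. Therefore, it vanishes. For example, the scattering of a photon by an external electromagnetic field receives no contributions of first order (or any odd order) in the external field.
This does not hold for individual Feynman diagrams because vertices and propagators and things do not really “conserve charge-inversion,” like they conserve momentum. For instance, a fermion loop with three photons attached does not vanish by itself; it cancels against the same loop with the fermion arrow reversed.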
10.2 Polology
Question 143. What is polology?
Answer 143. We would like to know where the poles are in the momentum-space amplitudes of
matrix elements. Because the poles give the biggest contributions, it may help us compute things,
perform contour integrals, as well as think about the on-shell equations of motion, etc.
In this next question, we prove the main result of this section. Denote
$$G(q_1, \cdots, q_n) = \int_{x_1, \cdots, x_n} e^{-iq_1 x_1 \cdots - iq_n x_n}\langle\hat T\{A_1(x_1) \cdots A_n(x_n)\}\rangle_0,$$
where $A_i$ are Heisenberg-picture operators and $\langle O\rangle_0$ is the expectation value in true vacuum. $A_i$ may be arbitrary local functions of fields and field derivatives. Contrary to the $\langle\Psi_\beta|O|\Psi_\alpha\rangle$ structure we were studying in the last section, here there are no external particle fields, just the vacuum. This can be useful to us if we are working with the Gell-Mann-Low theorem, for example.
Question 144. Consider $G$, defined in the previous question, as a function of $q^2$, where
$$q := q_1 + \cdots + q_r = -q_{r+1} - \cdots - q_n.$$
Here, $1 \le r \le n - 1$. The central result of this section is that $G$ has a pole at $q^2 = -m^2$, where $m$ is the mass of any one-particle state that has a non-vanishing matrix element between the states $A_1^\dagger \cdots A_r^\dagger|\Omega\rangle$ and $A_{r+1} \cdots A_n|\Omega\rangle$. The behavior near this pole is
$$G \to \frac{-2i\sqrt{\vec q^2 + m^2}}{q^2 + m^2 - i\epsilon}(2\pi)^7\delta(q_1 + \cdots + q_n)\sum_\sigma M_{0|\vec q\sigma}(q_2 \cdots q_r) M_{\vec q\sigma|0}(q_{r+2} \cdots q_n),$$
where
$$M_{0|\vec p\sigma}(q_2 \cdots q_r) \times (2\pi)^4\delta(q_1 + \cdots + q_r - p) = \int_{x_1 \cdots x_r} e^{-iq_1 x_1 \cdots - iq_r x_r}\langle\Omega|\hat T\{A_1(x_1) \cdots A_r(x_r)\}|\Psi_{\vec p\sigma}\rangle.$$
(We will see in the next question why $q_1$ and $q_{r+1}$ are missing from the arguments. It has to do with absolute versus relative coordinates.) Interpret the physical meaning of this theorem.
Answer 144. Wow, what a complicated theorem. What does it mean? Notice that with
$$q := q_1 + \cdots + q_r = -q_{r+1} - \cdots - q_n,$$
we are conceptually splitting the external fields $A_i$ into fields coming in (1 through $r$) and fields going out ($r + 1$ through $n$). This corresponds to the time-ordering in which the times $t_1, \cdots, t_r$ are all later than $t_{r+1}, \cdots, t_n$, which is always a possibility because we are integrating over all the $x$s and have to pass over this kind of time-ordering. What if there was a Feynman diagram such that a single line connected the lines going in and out? Then the total matrix element would be proportional to a propagator
$$\frac{1}{k^2 + m^2 - i\epsilon}$$
and it is not necessary that such a particle actually appears in the Lagrangian of the theory. It can correspond to a bound state for this process. In that case the pole arises not from single Feynman diagrams, but rather from infinite sums of diagrams. This is a nonperturbative claim. Also good to know is that the bound state does not necessarily correspond to a “single particle.” That’s not how we should think about it. It could be like a bunch of particles having an existentially short-lived orgy or something and then leaving the party, never to see each other again. But that party itself constitutes the bound state, and we represent the bound state with a one-particle state. The bound state is technically an infinite sum of diagrams made up of the fields in the Lagrangian.
Of course, there can be a pole like this for every way of partitioning the external fields into “in” fields and “out” fields.
Question 145. Outline the proof of this “polology” theorem. Where does the propagator $\frac{1}{k^2 + m^2 - i\epsilon}$ arise in the mathematics?
Answer 145. If we are to think of the first $r$ fields coming in and the last $n - r$ fields going out, we need to insert a $\theta$-function to make the time-ordering come true.
$$G(q_1 \cdots q_n) \supset \int_{x_1 \cdots x_n} e^{-iq_1 x_1 \cdots - iq_n x_n}\,\theta(\min[x_1^0 \cdots x_r^0] - \max[x_{r+1}^0 \cdots x_n^0]) \times \langle\hat T\{A_1(x_1) \cdots A_r(x_r)\}\hat T\{A_{r+1}(x_{r+1}) \cdots A_n(x_n)\}\rangle_0.$$
Because we are interested in virtual one-particle states, we can insert an incomplete set of one-particle states, ignoring intermediate states with two or more particles:
$$G(q_1 \cdots q_n) \supset \int_{x_1 \cdots x_n} e^{-iq_1 x_1 \cdots - iq_n x_n}\,\theta(\min[x_1^0 \cdots x_r^0] - \max[x_{r+1}^0 \cdots x_n^0]) \times \int_{\vec p}\sum_\sigma \langle\Omega|\hat T\{A_1(x_1) \cdots A_r(x_r)\}|\Psi_{\vec p\sigma}\rangle\langle\Psi_{\vec p\sigma}|\hat T\{A_{r+1}(x_{r+1}) \cdots A_n(x_n)\}|\Omega\rangle.$$
Now some true voodoo is going to happen. We turn the Heaviside function into a function with a pole, like in the regular Feynman propagator,
$$\theta(\tau) = -\frac{1}{2\pi i}\int_{-\infty}^{\infty}\frac{e^{-i\omega\tau}}{\omega + i\epsilon}\,d\omega.$$
To turn the $\frac{1}{\omega}$ into something in terms of the fields and what we know, we switch to relative coordinates. The overall center coordinate is arbitrarily chosen to be $x_1$ for the incoming particles and $x_{r+1}$ for the outgoing particles. We need translational invariance for both sets of particles. Integration over the “center” coordinates $x_1$ and $x_{r+1}$ then gives the energy-conserving $\delta$-functions
$$\delta\left(\sqrt{\vec p^2 + m^2} + \omega - q_1^0 - \cdots - q_r^0\right) \quad\text{and}\quad \delta\left(\sqrt{\vec p^2 + m^2} + \omega + q_{r+1}^0 + \cdots + q_n^0\right).$$
As you can see here, the matrix element will be nonzero only if momentum is conserved. However, if momentum is conserved, the matrix element is not necessarily nonzero. It means that the scattering process does care whether the intermediate one-particle state is Al Gore or Jerry Falwell, even if they weigh the same and have the same momentum. Then we use, near the pole,
$$\frac{1}{p^0 - \sqrt{\vec p^2 + m^2} + i\epsilon} \to \frac{-2\sqrt{\vec p^2 + m^2}}{p^2 + m^2 - i\epsilon}$$
and this gives the result.
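The last replacement is only a statement about the behavior near the pole, and it is easy to check numerically. The sketch below is an illustration with made-up numbers: it drops the $i\epsilon$, uses the mostly-plus metric $p^2 = \vec p^2 - (p^0)^2$, and compares the two expressions a relative distance `delta` above $p^0 = \sqrt{\vec p^2 + m^2}$.

```python
import math

def energy_denominator(p0, p_vec_sq, m):
    """1 / (p^0 - sqrt(|p|^2 + m^2)), with the i*epsilon dropped."""
    energy = math.sqrt(p_vec_sq + m ** 2)
    return 1.0 / (p0 - energy)

def covariant_form(p0, p_vec_sq, m):
    """-2 sqrt(|p|^2 + m^2) / (p^2 + m^2), with p^2 = |p|^2 - (p^0)^2."""
    energy = math.sqrt(p_vec_sq + m ** 2)
    p_sq = p_vec_sq - p0 ** 2
    return -2.0 * energy / (p_sq + m ** 2)

def pole_ratio(delta, p_vec_sq=2.0, m=1.5):
    """Ratio of the two forms at p^0 = E (1 + delta); tends to 1 as delta -> 0."""
    energy = math.sqrt(p_vec_sq + m ** 2)
    p0 = energy * (1.0 + delta)
    return covariant_form(p0, p_vec_sq, m) / energy_denominator(p0, p_vec_sq, m)
```

Analytically the ratio is $2E/(E + p^0)$, so the two forms agree up to corrections of relative size `delta / 2`, which is all the pole argument needs.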
Why is it that we cared about the one-particle intermediate states and not, say, two- or three-particle intermediate states? Apparently it is because multiple-particle intermediate states produce branch points in $p$, which are gentler than poles. This makes sense if you think about something like
$$\frac{1}{(p - p')^2 + m^2}\,\frac{1}{(p')^2 + (m')^2}.$$
Here, there is certainly no obvious pole in $p$. There is a pole in $p'$, but that momentum is integrated over: it belongs to a virtual particle only, rather than to the momenta of the incoming and outgoing particles, $p$.
In retrospect, Weinberg maybe didn’t need to phrase things in position-space. We could have avoided writing so many $\int_{x_1 \cdots x_n}$ integrals. The only value of the position-space integration is to end up with momentum-conserving $\delta$-functions.
Question 146. Explain how the “polology” of the nonrelativistic Yukawa interaction, which is an
approximation to the residual strong force between nucleons, suggested the existence of π mesons.
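Answer 146. The Yukawa potential in nonrelativistic theory gives a scattering amplitude like (no integrations over energy because we assume everything is on-shell)
$$\int_{\vec x_1 \vec x_2, \vec x_1' \vec x_2'} e^{-i(\vec x_1 \cdot \vec p_1 + \vec x_2 \cdot \vec p_2) + i(\vec x_1' \cdot \vec p_1' + \vec x_2' \cdot \vec p_2')}\,\frac{e^{-m_\pi|\vec x_1 - \vec x_2|}}{2\pi|\vec x_1 - \vec x_2|}\,\delta(\vec x_1 - \vec x_1')\delta(\vec x_2 - \vec x_2') = -(2\pi)^3\delta(\vec p_1 + \vec p_2 - \vec p_1' - \vec p_2')\frac{1}{(\vec p_1 - \vec p_1')^2 + m_\pi^2}.$$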
The last term is the nonrelativistic limit of the regular QFT propagator, i.e. for low energy so the energy term disappeared, since for nucleons $N$, $\frac{1}{2m_N}(\vec p_1^2 - (\vec p_1')^2)$ is smaller than $|\vec p_1 - \vec p_1'|$ due to the large mass. Another way to think of it is the static limit; this also happens to be how you get the Coulomb interaction $\frac{1}{\vec q^2}$ from the more general retarded interaction $\frac{1}{\omega^2 - \vec q^2}$.
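The link between the Yukawa potential and the static propagator is a Fourier transform, which can be checked numerically. This is a hedged sketch: it assumes the common normalization $e^{-\mu r}/(4\pi r)$ (the overall constant and sign in the equation above may differ by convention), for which the angular integration reduces $\int d^3r\, e^{-i\vec q\cdot\vec r}\, e^{-\mu r}/(4\pi r)$ to $\frac{1}{q}\int_0^\infty dr\, e^{-\mu r}\sin(qr) = \frac{1}{q^2 + \mu^2}$. The function name, cutoff, and step count are illustrative choices.

```python
import math

def yukawa_ft(q, mu, r_max_factor=40.0, steps=40000):
    """Simpson's rule for (1/q) * integral_0^R exp(-mu r) sin(q r) dr, with R = r_max_factor / mu."""
    r_max = r_max_factor / mu   # exp(-mu R) is negligible at this cutoff
    h = r_max / steps           # steps must be even for Simpson's rule
    total = 0.0
    for i in range(steps + 1):
        r = i * h
        f = math.exp(-mu * r) * math.sin(q * r)
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * f
    return total * h / (3.0 * q)

def static_propagator(q, mu):
    """Static limit of the massive propagator, 1 / (q^2 + mu^2)."""
    return 1.0 / (q ** 2 + mu ** 2)
```

For any momentum transfer `q` and mass `mu`, `yukawa_ft(q, mu)` should match `static_propagator(q, mu)` to high accuracy, so the range $1/\mu$ of the potential directly encodes the mass of the exchanged state.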
Based on this, Yukawa proposed the existence of a meson field with mass around 100 MeV. The
viewpoint at that time was that the Lagrangian was thus far incomplete and they were missing a
meson field in the theory. The modern interpretation, based on polology, is that while a theory of mesons is one way to understand things, there is not an absolute need to throw in another field to account for what looks like a virtual particle of mass $m_\pi$. In the modern interpretation, the $\pi$ meson, or pion, is interpreted as a bound state of a quark and an antiquark. It does not have to
appear, itself, in the fundamental Lagrangian. Of course, you can phenomenologically introduce
such a field in an effective Lagrangian. This is nice because pions turn out to be unstable with very
short lifetime. It would not be honest to introduce them as “true fields” in the Lagrangian, because
that would imply that you could have a free-particle state made of pions. But that free-particle
state would die out on its own, which doesn’t make sense; hence polology is a nice way to account
for particles that mediate exchange but typically aren’t reagents or products in themselves.
10.3 Field and mass renormalization
In this section, we will study the renormalization of masses and fields. Of course, charges, coupling
constants, etc. also need to be renormalized. The main point of this section is
“The renormalization of masses and fields [and couplings] has nothing directly to do with the
presence of infinities and would be necessary even in a theory in which all momentum space
integrals were convergent.”
Question 147. What is a renormalized field ?
Answer 147. A renormalized field is a field “whose propagator has the same behavior near its
pole as for a free field.” The renormalized mass is “defined by the position of the pole,” and is
generally different from the mass of the bare field.
We will see in this section that if $|q, \sigma\rangle$ is the one-particle state of a field (renormalized or not), it is convenient to define a renormalized field as one whose normalization satisfies
$$\langle\Omega|\Psi_l|q, \sigma\rangle = \frac{u_l(q\sigma)}{(2\pi)^{3/2}}.$$
This is always satisfied by bare fields in free theories. However, if we have interactions, then this
will not be satisfied by the bare fields. We will find it convenient to introduce renormalized fields,
which are obtained, as we know, with self-energy diagrams.
This is the point of “having the same residue.” In a sense, the residue is the number of particles carried by the field. We would like to have the same number of particles, but just with a different dispersion, so we demand physically that the number of particles is equal to one. See the discussion at
https://www.physicsoverflow.org/15999/renormalization-condition-must-the-residue-the-propa
Question 148. How does the renormalized field appear in a simple theory? How can I see its
connection to the 1PI graphs, or equivalently, to the self-energy diagrams?
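Answer 148. Consider the Lagrangian density
$$\mathcal{L} = \frac{1}{2}\phi(\Box - m^2)\phi - V(\phi),$$
where $\Box = \partial_\mu\partial^\mu$ in Weinberg’s mostly-plus metric, so that this is the usual $-\frac{1}{2}\partial_\mu\phi\,\partial^\mu\phi - \frac{1}{2}m^2\phi^2 - V(\phi)$ up to a total derivative. Here, $\phi, m, V$ are all bare quantities. However, there is an interaction. Due to the interaction, $\phi$ may not be normalized according to the convention in the previous question. The solution is to introduce a renormalized field and mass,
$$\tilde\phi = Z^{-1/2}\phi \quad\text{and}\quad \tilde m^2 = m^2 + \delta m^2,$$
such that $\tilde\phi$ is a renormalized field whose propagator has a pole at $q^2 = -\tilde m^2$. Then we write
$$\mathcal{L} = \mathcal{L}_0 + \mathcal{L}_1$$
where
$$\mathcal{L}_0 = \frac{1}{2}\tilde\phi(\Box - \tilde m^2)\tilde\phi,$$
$$\mathcal{L}_1 = \frac{1}{2}(Z - 1)\tilde\phi(\Box - \tilde m^2)\tilde\phi + \frac{Z}{2}\delta m^2\tilde\phi^2 - \tilde V(\tilde\phi),$$
where $\tilde V(\tilde\phi) = V(\sqrt{Z}\tilde\phi)$.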
Question 149. How can I find Z and m̃ for the renormalized field in the scalar theory?
Answer 149. Weinberg starts with the Lagrangian for the renormalized field to find these things.
It seems a little weird to find Z and m̃ from a theory which already contains these parameters. I
guess it works because actually m̃ hasn’t been defined yet. In a sense, we will define it only after
solving for Z (more on this below!).
Let us denote the self-energy, or sum of 1PI graphs, as $\Sigma(q^2) = i(2\pi)^4\Pi(q^2)$. Important point: these “1PI” graphs actually include an insertion of vertices (i.e. a single line which is not the propagator) plus the regular loop contributions:
$$\Pi(q^2) = -(Z - 1)(q^2 + \tilde m^2) + Z\delta m^2 + \Pi_{loop}(q^2).$$
There is no $\frac{1}{2}$ in the $\Pi(q^2)$ because of the $\frac{1}{2!}$ kind of thing on the vertices. Also, the extra $(2\pi)^4$ normalization in front is to match $-i(2\pi)^{-4}\frac{1}{q^2 + \tilde m^2 - i\epsilon}$, or rather, to cancel out the $(2\pi)^4$ thing when we write a geometric series (sneaky). The renormalized propagator is just
$$\Delta'(q) = \frac{1}{q^2 + \tilde m^2 - \Pi(q^2) - i\epsilon}.$$
Remember that the definition of renormalized mass $\tilde m$ is the pole of $\Delta'(q)$. If it smells like a normal propagator, it should also have a unit residue, like the free propagator. This implies that
$$\Pi(-\tilde m^2) = 0 \quad\text{and}\quad \left.\partial_{q^2}\Pi(q^2)\right|_{q^2 = -\tilde m^2} = 0.$$
The solution is therefore
$$Z = 1 + \left.\partial_{q^2}\Pi_{loop}(q^2)\right|_{q^2 = -\tilde m^2}, \quad Z\delta m^2 = -\Pi_{loop}(-\tilde m^2).$$
These solutions are explicit. There is no Z- or δm2 -dependence in Πloop .
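To see the two renormalization conditions at work, here is a toy numerical sketch. The loop function `pi_loop` is entirely made up (it is not a real one-loop result), and the constants `M2`, `G`, `LAM2` are hypothetical; the only point is that choosing $Z$ and $Z\delta m^2$ as above forces $\Pi$ and its first derivative to vanish at $q^2 = -\tilde m^2$, so the full propagator has a pole there with unit residue.

```python
import math

M2 = 1.0      # renormalized mass squared, tilde m^2 (toy units)
G = 0.1       # hypothetical coupling strength
LAM2 = 10.0   # hypothetical scale in the made-up loop function

def pi_loop(s):
    """Made-up stand-in for Pi_loop(q^2), smooth near s = q^2 = -M2."""
    return G * LAM2 * math.log(1.0 + (s + 4.0 * M2) / LAM2)

def d_pi_loop(s):
    """Derivative of pi_loop with respect to s = q^2."""
    return G / (1.0 + (s + 4.0 * M2) / LAM2)

Z = 1.0 + d_pi_loop(-M2)   # Z = 1 + dPi_loop/dq^2 at q^2 = -tilde m^2
Z_DM2 = -pi_loop(-M2)      # the combination Z * delta m^2

def pi_full(s):
    """Pi(q^2) = -(Z - 1)(q^2 + tilde m^2) + Z delta m^2 + Pi_loop(q^2)."""
    return -(Z - 1.0) * (s + M2) + Z_DM2 + pi_loop(s)

def inverse_propagator(s):
    """q^2 + tilde m^2 - Pi(q^2), with the i*epsilon dropped."""
    return s + M2 - pi_full(s)

def residue_estimate(delta=1e-5):
    """(q^2 + tilde m^2) times the full propagator, evaluated just off the pole."""
    s = -M2 + delta
    return (s + M2) / inverse_propagator(s)
```

With these choices `pi_full(-M2)` vanishes, a finite-difference slope of `pi_full` at `-M2` is numerically zero, and `residue_estimate()` is 1 up to terms of order `delta`. Nothing here needs `pi_loop` to be divergent, in line with the remark quoted at the start of this section.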
For the Dirac field, as another example, the Lagrangian contains $\tilde m$ rather than $\tilde m^2$. Thus, renormalization is more convenient with $\tilde m = m + \delta m$ rather than $\tilde m^2 = m^2 + \delta m^2$.
Question 150. How is the calculation of Z and m̃2 above related to the things like “minimal
subtraction scheme” and whatever from Schwartz QFT ?
Answer 150. Usually, people are not so careful when setting up the idea of renormalized field.
An equivalent, and quicker, way to do things is to simply calculate
$$\Pi_{loop}(q^2)$$
and say “okay, perhaps $\Pi_{loop}(q^2)$ has some infinities, but I can subtract a linear polynomial in $q^2$ by choosing coefficients to satisfy the conditions $\Pi(-\tilde m^2) = 0$ and $\partial_{q^2}\Pi(-\tilde m^2) = 0$.” This is equivalent to subtracting out the infinities by hand. It is the reason why $Z$ is formally infinite.
Unfortunately, it’s not yet clear to me why the infinities should be limited to q 2 terms. Hopefully
we will find out soon.
Question 151. Explain how our understanding of polology from the last section can help us
understand the need for field renormalization. What does this have to do with the LSZ reduction
formula?
Answer 151. Let us describe the idea of LSZ reduction formula in very heuristic terms. Then,
we will see what it has to do with renormalized fields.
The idea of LSZ reduction formula is as follows. Consider a Lagrangian $\mathcal{L}$ and let $O_l(x)$ be a Heisenberg-picture operator living in an irrep $l$ with the same Lorentz-transform properties as the free field $\psi_l$. $O_l(x)$ may be totally unrelated to the fields in $\mathcal{L}$.
Usually, we calculate scattering processes and matrix elements with the fields in the Lagrangian. Let me call them $\psi_L$. LSZ claims that we can develop Feynman rules to calculate the same scattering processes and matrix elements for $\mathcal{L}$, except by using the Heisenberg operators $O_l(x)$
which have nothing to do with L . We can think about this as follows: you have a girlfriend who
will be late, so she asked you to make lunch. However, the red line subway broke down and you
will also be late. Rather than starve and not have her matrix element, your girlfriend outsources
all elements of the task to various sentient beings. Girlfriend’s girlfriend buys some ginger ale from
CVS and girlfriend’s other friend calls her mom who sends her dog to tell her husband to text his
coyote to run to Felipe’s and steal some burritos, which are brought back to the dorm via carrier
pigeons controlled by the CIA. In the end, she gets a lunch. It is the same scattering process as
if she just calculated things with the fields in the original Lagrangian (i.e. if I bought burritos).
As you can imagine, this is a very deep theorem. Perhaps it is not so surprising, since it’s kind of
like inserting lots of intermediate states in linear algebra.
The LSZ formula gives an explicit form for the “propagator” of Ol , or more precisely, the propagator of ψl , which has the same Lorentz properties as Ol .
Question 152. Outline the proof of the Lehmann-Symanzik-Zimmermann reduction formula. What an intimidating name!
Answer 152. In the notation of the previous section, suppose $r = 1$. This means that, when we group the external operators 1 through $n$ into “in” and “out” categories, there is only one operator in the “in” category. Let the “in” operator be $O_l$, where $l$ represents an irrep of the Lorentz group, and let the “out” operators be $A_i(x_i)$. Let us write the corresponding matrix element as
$$G_l(q_1 q_2 \cdots) = \int_{x_1 x_2 \cdots} e^{-iq_1 x_1 - iq_2 x_2 \cdots}\langle\Omega|\hat T\{O_l(x_1) A_2(x_2) \cdots\}|\Omega\rangle.$$
Ol means that O is a Heisenberg-picture operator with the same Lorentz transform properties as
a free field ψl belonging to an irrep l of the Lorentz group. It could be that Ol is some complicated
function of the fields, which happens to transform in the l-irrep. The polology theorem of the
previous section suggests that if there is a one-particle state |Ψq1 σ i having nonzero matrix elements
with the “in” and “out” states, then we can write
$$\langle\Omega|O_l(0)|\Psi_{q_1\sigma}\rangle = N\langle\Omega|\psi_l|\Psi_{q_1\sigma}\rangle = (2\pi)^{-3/2} N u_l(q_1\sigma),$$
by Lorentz-invariance. The result is
$$G_l(q_1 q_2 \cdots) \to \frac{-2i\sqrt{\vec q_1^2 + m^2}}{q_1^2 + m^2 - i\epsilon}\sum_{\sigma l'} u_l(q_1\sigma) u^*_{l'}(q_1\sigma) M_{l'}$$
where $M_{l'}$ is a complicated matrix element and $u_{l'}$ means the same kind of coefficient function but for a different irrep. The sum $\sum_{\sigma l'} u_l(q_1\sigma) u^*_{l'}(q_1\sigma)$ looks kind of like the spin sum on top of a propagator. Neat!
10.4 Renormalized charge; Ward-Takahashi identity
As Girma Hailu likes to tell people, the electron charge is a function of energy scale. At asymptotically low energy it is $e = 1.6 \times 10^{-19}$ C and at higher energy it is different. We will see how to be
consistent with defining the charge of a renormalized field. This is called renormalized charge,
but the idea is broadly applicable to any kind of global symmetry, not just charge conservation.
Recall that the Ward identity is a generalization of $\partial_\mu J^\mu = 0$ to multiple external currents. The idea
of renormalized charge gives us a more general and nonperturbative Ward identity, called the
Ward-Takahashi identity.
Question 153. Explain why the charge needs to be renormalized. Describe how to do so.
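Answer 153. Let $Q$ be the charge operator and let $F(y)$ be a local function of the fields at the point $y$, with $[Q, F(y)] = -q_F F(y)$. If a matrix element $\langle\Omega|F(y)|p\sigma n\rangle$ conserves charge (we saw this matrix element was important in the LSZ reduction theorem), then, using $Q|\Omega\rangle = 0$ and $Q|p\sigma n\rangle = q_n|p\sigma n\rangle$,
$$\langle\Omega|[Q, F(y)]|p\sigma n\rangle = -q_n\langle\Omega|F(y)|p\sigma n\rangle = -q_F\langle\Omega|F(y)|p\sigma n\rangle \implies q_n = q_F$$
whenever the matrix element is nonzero.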
Of course, qF is the sum of the electric charges of the fields being multiplied together to give F (y).
So, we have charge conservation. However, charge conservation holds under a global rescaling of
all charges, qn → αqn , so it looks like we have too much freedom. To fix this, we say the “physical
electric charges are those that determine the response of matter fields to a given renormalized
electromagnetic field Aµ .” In other words, we demand the electron-photon coupling has the form
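$$\mathcal{L} \supset \bar\psi_l(\partial_\mu - iq_l A_\mu)\psi_l.$$
This is convenient because the current retains the form $J^\mu = \left.\frac{\delta\mathcal{L}}{\delta A_\mu}\right|_{A=0}$.
If the renormalized field is $\tilde A^\mu = \frac{1}{\sqrt{Z_3}}A^\mu$, we should define
$$\tilde q_l = \sqrt{Z_3}\, q_l,$$
so that the product $\tilde q_l\tilde A_\mu = q_l A_\mu$ in the coupling is unchanged.
The physical electric charge is $\tilde q_l$, and the proportionality constant $\sqrt{Z_3}$ is the same for all particles (obviously, since it renormalizes the photon, which is not a charged particle).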
Question 154. What does the above theorem tell us about the charges of different particles, at
various energy scales?
Answer 154. The charges of the proton and electron are exactly opposite at all energy scales, because both are rescaled only by the factor $\sqrt{Z_3}$, which is the same for each (because it’s for the photon!). See also: Schwartz QFT section 19.5.
This means that there must be cancellations among the radiative corrections to the propagators
and vertices of the charged particles. These cancellations are described by the Ward-Takahashi
identity. We derive it in the next question.
Question 155. What is the Ward-Takahashi identity? Describe its connection to the less general $\partial_\mu J^\mu = 0$, and explain how the more general form is related to Noether’s theorem for the global $U(1)$ symmetry. Elucidate why it means there are no radiative corrections to the charge due to the electron-photon vertices.
Answer 155. Let $S'(k)$ and $\Gamma^\mu(k, l)$ be the dressed electron propagator and electron-photon vertex, for a $k^\mu$-momentum electron coming in and an $l^\mu$-momentum electron coming out. The free-field limit (charge $q \to 0$) is obviously
$$S'(k) \to \frac{1}{i\not k + m - i\epsilon} \quad\text{and}\quad \Gamma^\mu(k, l) \to \gamma^\mu.$$
The Ward-Takahashi identity is
$$(l - k)_\mu\Gamma^\mu(k, l) = i\left(S'^{-1}(k) - S'^{-1}(l)\right).$$
In the $l \to k$ limit, this gives the original Ward identity, which is just
$$\Gamma^\mu(k, k) = -i\frac{\partial}{\partial k_\mu}S'^{-1}(k) \implies \Gamma^\mu(k, k) = \gamma^\mu + i\frac{\partial}{\partial k_\mu}\Sigma(k),$$
where the second form uses $S'^{-1}(k) = i\not k + m - \Sigma(k)$.
The claim is that Ward-Takahashi identity ensures the radiative corrections to the vertex function $\Gamma^\mu$ cancel when a fermion on the mass shell interacts with an electromagnetic field with zero momentum transfer (such as when we measure the electric charge). That is because on the mass shell,
$$\bar u'_k\Gamma^\mu(k, k)u_k = \bar u'_k\gamma^\mu u_k.$$
This follows by using the Dirac equation for the spinors, and from $\left.\frac{\partial\Sigma}{\partial\not k}\right|_{\not k = im} = 0$.
The conclusion is that only the renormalization of the photon propagator makes the charge flow. Neither the renormalization of the electron propagator nor that of the electron-photon vertex changes the charge; their effects are “cancelled out,” so to speak, via Ward-Takahashi. In this sense, the inverse propagator changes by something like $\Sigma(k)$ and the vertex changes by $i\frac{\partial}{\partial k_\mu}\Sigma(k)$, and these changes turn out to be exactly opposite.
Question 156. Derive the Ward-Takahashi identity.
Answer 156. The dressed electron-photon vertex can be written
$$\hat T\{J^\mu(x)\Psi_n(y)\bar\Psi_m(z)\}.$$
Ward-Takahashi identity follows from
$$\partial_\mu\hat T\{J^\mu(x)\Psi_n(y)\bar\Psi_m(z)\} = \hat T\{(\partial_\mu J^\mu(x))\Psi_n(y)\bar\Psi_m(z)\} + \delta(x^0 - y^0)\hat T\{[J^0(x), \Psi_n(y)]\bar\Psi_m(z)\} + \delta(x^0 - z^0)\hat T\{\Psi_n(y)[J^0(x), \bar\Psi_m(z)]\}.$$
This is because $\vec\nabla_x$ doesn’t act on $y$ or $z$, but the time-ordering means the quantity changes nontrivially when $x^0 = y^0$, for example. The terms with $\delta$-functions are called contact terms and are described in Schwartz QFT as not contributing to the scattering amplitude, because they don’t have poles. We know $\partial_\mu J^\mu = 0$ and also
$$J^\mu = -i\sum_l\frac{\partial\mathcal{L}}{\partial(\partial_\mu\Psi_l)}q_l\Psi_l \implies [J^0(\vec x, t), \Psi_l(\vec y, t)] = -q_l\Psi_l(\vec y, t)\delta(\vec x - \vec y).$$
Then we insert the result into
$$-i(2\pi)^4 q S'_{nn'}(k)\Gamma^\mu_{n'm'}(k, l)S'_{m'm}(l)\delta(p + k - l) = \int_{xyz} e^{-i(px + ky - lz)}\langle\Omega|\hat T\{J^\mu(x)\Psi_n(y)\bar\Psi_m(z)\}|\Omega\rangle.$$
The result is the Ward-Takahashi identity. How can we understand this? Ward-Takahashi identity means that the kinetic term $\bar\Psi\not\partial\Psi$ and the interaction $\bar\Psi\not A\Psi$ are renormalized in the same way, so they can be combined into a gauge-covariant derivative term, $\bar\Psi\not D\Psi$. In this sense, the electric charge doesn’t change because it’s kind of a ratio between $\not\partial$ and $\not A$.
Another way to understand the derivation of Ward-Takahashi identity is as follows: note that for any operators $\cdots$, we have
$$\partial_\mu\langle\Omega|\hat T J^\mu(x) \cdots|\Omega\rangle = \text{contact terms}.$$
Because the contact terms do not contribute to the scattering amplitude, we find that
$$k_\mu M^\mu = 0,$$
where $M^\mu$ is the matrix element represented by the correlation function (or rather, the Fourier transform of the time-ordered quantity). Actually if we include the renormalization factors explicitly, it looks like
$$(k - l)_\mu\Gamma^\mu(k, l) = Z_2^{-1} Z_1 e\left(S'(k)^{-1} - S'(l)^{-1}\right)$$
and since $\Gamma^\mu$ and $S'$ should be finite, this suggests $Z_1 = Z_2$, which Schwartz tried to prove in his book. Here, $Z_2$ is the renormalization factor on the electron field (and hence the electron propagator) and $Z_1$ is the renormalization factor on the electron-photon vertex. See http://www.physics.indiana.edu/~dermisek/QFT_08/qft-II-15-1p.pdf.
10.5 Gauge invariance of the S-matrix
We claim that the S-matrix is gauge-invariant. This means the following: let a photon propagator in a matrix element be $\Delta_{\mu\nu}(q)$ and consider the gauge transform
$$\Delta_{\mu\nu}(q) \to \Delta_{\mu\nu}(q) + \alpha_\mu q_\nu + q_\mu\beta_\nu$$
or the redefinition of polarizations
$$\epsilon_\mu(k\lambda) \to \epsilon_\mu(k\lambda) + ck_\mu,$$
where $\alpha_\mu, \beta_\nu, c$ don’t have to be constants, and don’t have to be the same for all photons in the theory. The claim is that the S-matrix does not change.
There will be some radiative corrections to the photon propagator. The important thing about
gauge invariance of the S-matrix is that the dressed photon propagator is still massless. This is a
really nontrivial thing, since usually the mass flows with energy scale.
Question 157. Outline a proof of the gauge-invariance of the S-matrix.
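Answer 157. This follows immediately by writing
$$S_{\beta\alpha} \propto \int_{q_1 q_2 \cdots}\Delta_{\mu_1\nu_1}(q_1)\Delta_{\mu_2\nu_2}(q_2) \cdots [\epsilon^{\mu_1}(k_1\lambda_1) \cdots][M_{ba}^{\mu_1\mu_2\cdots}(\text{momenta})]$$
and using the result of the Ward identity
$$p_\mu M_{ba}^{\mu\cdots} = 0.$$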
This is not very clear in the book. I think the logic is kind of like this: $p_\mu M^{\mu\cdots} = 0$ is the generalization of Noether’s theorem $\partial_\mu J^\mu = 0$, but it seems to hold even off-shell. That is because the gauge transform $A_\mu \to A_\mu + \partial_\mu\epsilon$ is supposed to leave the matrix element
$$M^{\mu\mu'\cdots}_{\beta\alpha}(q, q', \cdots) = \int_{x,x',\cdots} e^{-i(qx + q'x' + \cdots)}\langle\beta|\hat T\{J^\mu(x)J^{\mu'}(x') \cdots\}|\alpha\rangle$$
invariant even if the photons coupling to the external currents are off-shell.
We can build the S-matrix out of these matrix elements, plus some electron propagators and
electron-photon vertices and stuff like that. Because the matrix elements are gauge-invariant even
when they are off-shell, so is the S-matrix.
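The last step can be spelled out in one line. This is only a sketch, assuming every propagator index is contracted with a current index of the off-shell matrix element (written generically as $M^{\mu\nu\cdots}$, with $q$ the momentum carried by that propagator) and ignoring sign conventions for incoming versus outgoing momenta:

```latex
% Change of the S-matrix under \Delta_{\mu\nu} \to \Delta_{\mu\nu} + \alpha_\mu q_\nu + q_\mu \beta_\nu
\delta S_{\beta\alpha} \;\propto\; \int_q \left(\alpha_\mu q_\nu + q_\mu \beta_\nu\right) M^{\mu\nu\cdots}(q,\cdots)
 \;=\; \int_q \alpha_\mu \underbrace{\left[q_\nu M^{\mu\nu\cdots}\right]}_{=\,0}
 \;+\; \int_q \beta_\nu \underbrace{\left[q_\mu M^{\mu\nu\cdots}\right]}_{=\,0} \;=\; 0 .
% Change under e_\mu(k,\lambda) \to e_\mu(k,\lambda) + c\,k_\mu
\delta S_{\beta\alpha} \;\propto\; c\,k_\mu M^{\mu\cdots}(k,\cdots) \;=\; 0 .
```

Both variations vanish term by term because each extra piece carries a momentum contracted with a current index, which is exactly what the off-shell Ward identity kills.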
A reflection: I think this is really interesting, since Noether’s theorem $\partial_\mu J^\mu = 0$ is an on-shell theorem, but the identity $p_\mu M^{\mu\cdots} = 0$ seems to hold off-shell as well, due to gauge invariance. I think this is because gauge invariance is the statement of a local symmetry, so we can say more powerful things if we have a local symmetry, compared to if we only have a global symmetry.
Here is a short derivation by Arkani-Hamed of the identity in question: http://pages.physics.cornell.edu/~ajd268/Notes/Ward_Identity.pdf. However, it is not the best because he uses an equation of motion for $A_\mu$, which only holds on-shell. A really good resource is here, http://bolvan.ph.utexas.edu/~vadim/Classes/16f/WTI.pdf. It states on the first page that the identities hold for “off-shell amplitudes.”
Question 158. Outline a proof of the following result:
Radiative corrections do not give the photon a mass.
Answer 158. Consider the 1PI corrections $\Pi(q)$, such that
$$\Delta'_{\mu\nu}(q) = \left[\Delta(q)^{-1} - \Pi(q)\right]^{-1}_{\mu\nu}.$$
This can be rewritten as
$$\Delta'_{\mu\nu}(q) = \Delta_{\mu\nu}(q) + \Delta_{\mu\rho}(q)\,\Pi^{\rho\sigma}(q)\,\Delta'_{\sigma\nu}(q).$$
Gauge invariance must hold for dressed propagators as well as for bare propagators. We know this is true because of the theorem on matrix elements, which can be thought of as bare propagators with lots of vertices in the middle, or dressed propagators with fewer vertices. This restriction, together with Lorentz invariance, gives
$$q_\rho \Pi^{\rho\sigma}(q) = 0 \implies \Pi^{\rho\sigma}(q) = (q^2 \eta^{\rho\sigma} - q^\rho q^\sigma)\,\pi(q^2).$$
We can find the inverse:
$$\Delta'_{\mu\nu}(q) = \frac{\eta_{\mu\nu} - \tilde{\xi}(q^2)\, q_\mu q_\nu / q^2}{(q^2 - i\epsilon)(1 - \pi(q^2))}, \quad \text{where} \quad \tilde{\xi}(q^2) = \xi(q^2)(1 - \pi(q^2)) + \pi(q^2).$$
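The inversion is quick if one splits everything into transverse and longitudinal projectors. The following is a sketch, assuming the bare propagator has the general-gauge form $\Delta_{\mu\nu} = [\eta_{\mu\nu} - \xi\, q_\mu q_\nu/q^2]/(q^2 - i\epsilon)$ (this is what the result above reduces to at $\pi = 0$); the projector names $P_T$, $P_L$ are introduced here for illustration, and the $i\epsilon$ is dropped:

```latex
% Projectors: P_T + P_L = 1, P_T P_L = 0, P_T^2 = P_T, P_L^2 = P_L
P_T^{\mu\nu} = \eta^{\mu\nu} - \frac{q^\mu q^\nu}{q^2}, \qquad
P_L^{\mu\nu} = \frac{q^\mu q^\nu}{q^2}.
% Bare propagator, its inverse, and the self-energy
\Delta = \frac{1}{q^2}\left[P_T + (1-\xi)P_L\right], \qquad
\Delta^{-1} = q^2\left[P_T + \frac{P_L}{1-\xi}\right], \qquad
\Pi = q^2\,\pi(q^2)\,P_T .
% Dressed propagator
\Delta'^{-1} = \Delta^{-1} - \Pi = q^2\left[(1-\pi)P_T + \frac{P_L}{1-\xi}\right]
\;\Longrightarrow\;
\Delta' = \frac{1}{q^2}\left[\frac{P_T}{1-\pi} + (1-\xi)P_L\right]
        = \frac{\eta - \left[1-(1-\xi)(1-\pi)\right] q q/q^2}{q^2\,(1-\pi)} .
% Read off the new gauge parameter
\tilde{\xi} = 1 - (1-\xi)(1-\pi) = \xi(1-\pi) + \pi .
```

The radiative correction only touches the transverse part, which is why it shows up as an overall factor $1/(1-\pi)$ plus a shift of the gauge parameter.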
This is kind of interesting. It looks like the dressed propagator differs from the bare propagator by a momentum-dependent overall factor and a different gauge choice. Here is the crux: $\Pi(q)$ is a sum of one-photon-irreducible graphs, so it does not have a pole at $q^2 = 0$. Otherwise there would be a matrix element with a massless “bound state” which could be represented as a photon propagator. Thus, the pole of the dressed propagator remains at $q^2 = 0$, since $\pi(q^2)$ has no pole at $q^2 = 0$ that could cancel the existing one.
This isn’t particularly watertight. Physically we can argue that the remaining pole at $q^2 = 0$ implies a unique mass $m = 0$; however, we did not show there are no new poles. New poles would be really confusing because we would not know which pole to use for the physical mass!
Question 159. Show that the renormalization factor $Z_3$ can be related to $\pi(q^2)$ from the previous question via
$$Z_3 = 1 + \pi_{\text{loop}}(q^2 = 0).$$
In practice, we just calculate the loop contributions and subtract a constant to make $\pi(0)$ vanish.
Answer 159. This follows because $\Delta'$ is supposed to contain all diagrams. Therefore, if we add further radiative corrections to $\Delta'$, they should not change the propagator and we should get $\Delta'$ back (kind of like a fractal?).
The gauge-invariant part of the propagator,
$$\Delta'_{\mu\nu}(q) = \frac{\eta_{\mu\nu}}{(q^2 - i\epsilon)(1 - \pi(q^2))},$$
should thus have a residue which is independent of radiative corrections. Therefore $\pi(0) = 0$.
If we write the Lagrangian as
$$\mathcal{L} = -\frac{1}{4} F^2 + (1 - Z_3)\,\frac{1}{4} F^2 + \mathcal{L}_{\text{matter}},$$
then the self-energy is like
$$\pi(q^2) = 1 - Z_3 + \pi_{\text{loop}}(q^2)$$
because the factor $1 - Z_3$ multiplies $q^2 \eta^{\rho\sigma} - q^\rho q^\sigma$ if you write the Lagrangian as $\mathcal{L} = A_\mu X^{\mu\nu} A_\nu$ or whatever. Setting $q^2 = 0$ and using $\pi(0) = 0$ gives the desired result.
A good question is: why is the residue of the photon pole important? Another way of phrasing this question is why the propagator should be invariant under more radiative corrections at $q^2 = 0$ (on-shell), but not necessarily for $q^2 \neq 0$ (off-shell). I think this is a physical condition which we can understand as coming from the LSZ theorem applied to a photon propagating by itself in empty space. From Wikipedia, “[LSZ] asserts that S-matrix elements are the residues of the poles that arise in the Fourier transform of the correlation functions as four-momenta are put on-shell.”
10.6
More on the LSZ reduction formula
This is not a section of Weinberg, but I feel that LSZ is both important and not something I
understand very well. So, I will study it in more depth than Weinberg gives.
I should say what I skipped in this chapter of Weinberg: the Källén-Lehmann (spectral) representation and dispersion relations. I think none of these topics are too difficult so I can always read them later. (By the way, apparently “none” is plural when referring to “not any,” so in the previous sentence I correctly used “are” instead of “is.”) The section on form factors, which is in Weinberg Ch 10, is moved to Ch 11 of my notes. Weinberg put it in Ch 10 because form factors are nonperturbative, but I put it in Ch 11 because they are useful in renormalizing QED; either organization is fine.
11
One-loop radiative corrections in QED
We renormalize QED to one loop. We will perform the following calculations: (1) vacuum polarization, which renormalizes the photon propagator; (2) the anomalous magnetic moment of the electron and the $\frac{\alpha}{2\pi}$ correction; (3) the electron self-energy, which renormalizes the electron propagator. Before doing (2), we will take a diversion and learn about form factors, whose properties are nonperturbative.
Here are the tools we will use:
• Feynman parameters
• Wick rotation
• Dimensional regularization (due to ’t Hooft and Veltman)
• Pauli-Villars regularization
I will not worry about the calculations. They are done in great detail, for example, in Schwartz
QFT with possibly more organized and polished methods. I am more interested in the motivation
and the exact procedures of counterterms, etc.
11.1
Counterterms
The point of counterterms is that the calculations will give infinities, but the final results are finite
“if expressed in terms of the renormalized charge and mass.” A counterterm generically looks like
(Z − 1) × (some term in the renormalized fields).
The Z − 1 will be formally infinite. We will understand this better in the next sections.
Introduce the renormalized fields (the B subscript means “bare”):
$$\psi = \frac{1}{\sqrt{Z_2}}\,\psi_B, \quad A^\mu = \frac{1}{\sqrt{Z_3}}\,A^\mu_B, \quad e = \sqrt{Z_3}\,e_B, \quad m = m_B + \delta m.$$
We must take the renormalizations of $e$ and $A^\mu$ to be opposites because of the combination $\bar{\psi}(\slashed{\partial} + ie\slashed{A})\psi$. If $A^\mu$ got renormalized, then we must change $e$ by the opposite factor in order for the response of the charged particles to a renormalized electric field to be the same. This is the same thing as something we proved in the last chapter: charge renormalization arises only from radiative corrections to the photon propagator.
Let us rewrite the QED Lagrangian in the renormalized fields and the counterterms:
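$$\mathcal{L} = \mathcal{L}_0 + \mathcal{L}_1 + \mathcal{L}_2$$
where
$$\mathcal{L}_0 = -\frac{1}{4} F^2 - \bar{\psi}(\slashed{\partial} + m)\psi,$$
$$\mathcal{L}_1 = -ie\,\bar{\psi}\slashed{A}\psi,$$
$$\mathcal{L}_2 = -\frac{1}{4}(Z_3 - 1) F^2 - (Z_2 - 1)\,\bar{\psi}(\slashed{\partial} + m)\psi + Z_2\,\delta m\,\bar{\psi}\psi - ie(Z_2 - 1)\,\bar{\psi}\slashed{A}\psi.$$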
Apparently, “all the terms in L2 are of O(e2 ) and higher, and these terms just suffice to cancel
the ultraviolet divergences in the sum of the loop graphs.” The point is that L0 + L1 is the
free field Lagrangian plus the regular electron-photon coupling, and the L2 are the counterterms.
(Note: it’s not obvious that L2 should be enough to cancel all the divergences. Often, if you have
a Lagrangian and want to write down the counterterms, you have to make sure you include all
possible scalars in the Lagrangian which could contribute, not just the scalars in the original bare
Lagrangian.) Let’s see how this plays out.
11.2
Vacuum polarization
The vacuum polarization is the virtual cloud of an electron-positron pair which is the leading-order correction to the photon propagator. It produces measurable shifts in the energy levels of hydrogen, etc.
As in the previous chapter, define $i(2\pi)^4 \Pi^{*\mu\nu}(q)$ as the 1PI sum with polarization indices $\mu$ and $\nu$. The complete photon propagator $\Delta'$ is
$$\Delta'_{\mu\nu} = \Delta_{\mu\pi}\left[(1 - \Pi^* \Delta)^{-1}\right]^{\pi}{}_{\nu}. \tag{1}$$
Question 160. What is the leading-order contribution to the 1PI energy?
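Answer 160. The leading-order correction is something like
$$i(2\pi)^4 \Pi^{*\mu\nu}(q) = -\int_p \mathrm{Tr}\left\{ \left[\frac{-i}{(2\pi)^4}\,\frac{-i\slashed{p} + m}{p^2 + m^2 - i\epsilon}\right] \times \left[(2\pi)^4 e\gamma^\mu\right] \times \left[\frac{-i}{(2\pi)^4}\,\frac{-i(\slashed{p} - \slashed{q}) + m}{(p - q)^2 + m^2 - i\epsilon}\right] \times \left[(2\pi)^4 e\gamma^\nu\right] \right\}.$$
To preserve the Ward identity $q_\mu \Pi^{*\mu\nu}(q) = 0$, it is best to use ’t Hooft and Veltman’s dimensional regularization. See Schwartz QFT or Weinberg for more details.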
Question 161. Suppose we have already calculated the value of the vacuum polarization bubble. How do we get rid of the divergence?
Answer 161. The idea is that the loop diagram is not the only $O(e^2)$ contribution to the 1PI energy; there is also the counterterm. Adding these two contributions together suggests
$$\Pi^{*\mu\nu}(q) = (q^2 \eta^{\mu\nu} - q^\mu q^\nu)\,\pi(q^2)$$
where
$$\pi(q^2) = -\frac{4e^2 \Omega_d}{(2\pi)^4}\,\Gamma\!\left(\tfrac{d}{2}\right)\Gamma\!\left(2 - \tfrac{d}{2}\right) \int_0^1 dx\, x(1-x)\,\left(m^2 + q^2 x(1-x)\right)^{\frac{d-4}{2}} - (Z_3 - 1).$$
Now we use the result from section 10.5 of these notes that the gauge-invariant part of the propagator,
$$\Delta'_{\mu\nu}(q) = \frac{\eta_{\mu\nu}}{(q^2 - i\epsilon)(1 - \pi(q^2))},$$
should have a residue which is independent of radiative corrections (due to the fractal nature of the loop graphs). Therefore $\pi(0) = 0$. This gives us the value of $Z_3$, at least to $O(e^2)$. Solving for $Z_3$ and then substituting it back into the self-energy gives
$$\pi(q^2) = -\frac{4e^2 \Omega_d}{(2\pi)^4}\,\Gamma\!\left(\tfrac{d}{2}\right)\Gamma\!\left(2 - \tfrac{d}{2}\right) \int_0^1 dx\, x(1-x) \left[\left(m^2 + q^2 x(1-x)\right)^{\frac{d-4}{2}} - (m^2)^{\frac{d-4}{2}}\right].$$
Let us see how this form leads to the cancellation of the poles. The regularization is removed by
taking d → 4.
• Divergence of the Γ-function: There is a divergence
$$\Gamma\!\left(2 - \tfrac{d}{2}\right) \to \frac{1}{2 - \frac{d}{2}} - \gamma,$$
where $\gamma$ is the Euler-Mascheroni constant. However, for $d \to 4$ we have
$$\left(m^2 + q^2 x(1-x)\right)^{\frac{d-4}{2}} \to 1, \quad (m^2)^{\frac{d-4}{2}} \to 1,$$
so the difference of these two factors vanishes linearly in $d - 4$. That is exactly fast enough to cancel the simple pole of the Γ-function, which is easy to check (or to intuit). Therefore, the divergences of the Γ-function have been regulated.
This is called a “cancellation of poles” because of the pole at $d = 4$ in $\frac{1}{2 - d/2}$.
• Many cancellations of finite parts: Many finite parts cancel between the loop diagram and
the counterterm, Z3 − 1. This is not really because we “wanted” them to cancel, but rather
because if we choose the correct value of Z3 − 1 for π(q 2 = 0) = 0, then they turn out to
cancel anyway. These finite parts include
– The term multiplying γ.
– The product of the pole in $\Gamma(2 - \frac{d}{2})$ and the term $(4 - d)\ln\mu$, where $\mu$ is the dimensionful generalization of the coupling constant $e$.
However, there is a finite part which changes with momentum $q$. It is the product of the pole in $\Gamma(2 - \frac{d}{2})$ and the linear term in the expansion of $(m^2 + q^2 x(1-x))^{\frac{d-4}{2}}$ and $(m^2)^{\frac{d-4}{2}}$ in powers of $d - 4$. This gives
$$\pi(q^2) = \frac{e^2}{2\pi^2} \int_0^1 dx\, x(1-x)\, \ln\!\left(1 + \frac{q^2 x(1-x)}{m^2}\right). \quad \text{(Final result)}$$
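For reference, here is how the pole and the linear term combine. This is a sketch of the $d \to 4$ limit, writing $\delta \equiv 4 - d$ (a symbol introduced here to avoid a clash with the $i\epsilon$) and using $\Omega_4 = 2\pi^2$ for the area of the unit sphere in four dimensions:

```latex
% Pole of the Gamma function and expansion of the powers, with \delta = 4 - d
\Gamma\!\left(2-\tfrac{d}{2}\right) = \Gamma\!\left(\tfrac{\delta}{2}\right) \simeq \frac{2}{\delta} - \gamma,
\qquad
X^{\frac{d-4}{2}} = X^{-\delta/2} \simeq 1 - \frac{\delta}{2}\ln X .
% The subtracted bracket is O(\delta)
\left(m^2+q^2x(1-x)\right)^{-\delta/2} - (m^2)^{-\delta/2}
  \simeq -\frac{\delta}{2}\,\ln\!\left(1+\frac{q^2x(1-x)}{m^2}\right).
% Pole times linear term is finite
\Gamma\!\left(2-\tfrac{d}{2}\right)\left[\cdots\right]
  \;\xrightarrow{\;\delta\to 0\;}\; -\ln\!\left(1+\frac{q^2x(1-x)}{m^2}\right).
% Prefactor at d = 4: \Gamma(2) = 1, \Omega_4 = 2\pi^2
-\frac{4e^2\,\Omega_4}{(2\pi)^4} = -\frac{8\pi^2 e^2}{16\pi^4} = -\frac{e^2}{2\pi^2}
\;\Longrightarrow\;
\pi(q^2) = \frac{e^2}{2\pi^2}\int_0^1 dx\,x(1-x)\,\ln\!\left(1+\frac{q^2x(1-x)}{m^2}\right).
```

The $-\gamma$ and any $\ln\mu$ pieces multiply the $O(\delta)$ bracket and so drop out in the limit, which is the cancellation of finite parts listed in the bullets.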
Let’s summarize this procedure. We fixed the value of Z3 −1 by demanding a physically meaningful
condition on the residue of the pole of the photon propagator. This happened to be at q 2 = 0
for the photon, but presumably for a massive propagator it would have been at the renormalized
mass, q 2 = m̃2 .
After fixing Z3 − 1, we found that the infinite part (Z3 − 1)∞ cancels the infinity in the loop
diagram corresponding to an incoming photon at any momentum (including off-shell momenta).
Essentially, this is because a finite change in momentum of the photon doesn’t change the character
of the divergence; it does, however, change the finite part of the loop diagram.
However, we found that this finite change in the loop diagram is related to the pole in $\Gamma(2 - \frac{d}{2})$.
So, it’s not like the pole is unimportant and gets totally subtracted out. The divergent nature of
the pole is subtracted out, but the pole gets multiplied by other things which can both (1) get rid
of the divergence (2) change with photon momentum q.
To summarize, the pole is like a behind-the-scenes kind of thing; we never see a pole directly,
but it is the important part of the loop diagram in the sense that only the divergence of the pole
contributes to a q-dependence of the self-energy. Behind every Sultan is an evil divergent Jafar;
regularization tames the out-of-control vizier running the kingdom.
Question 162. Describe the physical effects of the radiative correction to the photon propagator.
Answer 162. First of all, these radiative corrections do not give the photon a mass. We proved
this in section 10.5 of these notes, and it is also obvious here because the pole of the propagator
is still located at q 2 = 0.
Heuristically, here is what happens: suppose we have an external electron Alice interacting with
an external positron Bob; they interact by shooting photons at each other. The electron-positron
virtual particle pair in the photon tends to have a particular orientation relative to Alice and Bob;
namely, it tends to shield Alice and Bob from each other. So the virtual positron is more likely
to be on Alice’s side, and the virtual electron is more likely to be on Bob’s side. This is the same
as if Alice and Bob were interacting with the original photon propagator, but their charges were
weaker. Indeed, this is the entire idea of the scale-dependence of the electronic charge.
It is shown in Weinberg (or, you can guess by taking the static limit of the propagators and expanding (1) to leading order in the self-energy) that the new Coulomb potential is like
$$V(\mathbf{q}) = \frac{e_1 e_2}{(2\pi)^3}\,\frac{1 + \pi(\mathbf{q}^2)}{\mathbf{q}^2}.$$
Correspondingly, the renormalized electronic charge is
$$e = \sqrt{Z_3}\,e_B,$$
as in section 11.1, where $e_B$ is the (infinite) bare electronic charge. We know from Girma Hailu that $e_B \to \infty$.
(Note: based on $e = \sqrt{Z_3}\,e_B$, it might seem that there is no scale-dependence of $e$. It depends on what you are talking about; if you cut off the integral $\int_k$ at some scale $\Lambda$, then you would get a scale dependence of $Z_3$ and hence of $e$ as well. This is the idea of Wilson’s RG.)
Question 163. Describe how this produces measurable shifts in the energy levels of hydrogen or muonic atoms.
Answer 163. The corrected Coulomb potential reduces to the regular Coulomb potential quite quickly,
$$\frac{1 + \pi(\mathbf{q}^2)}{\mathbf{q}^2} \to \frac{1}{\mathbf{q}^2} \quad \text{for } |\mathbf{q}| \ll m,$$
where $m$ is the electron mass (it was part of the loop calculation!). The electrons are not very massive, so the correction is important only when $|\mathbf{q}|$ is large, or equivalently only when the separation between two charged particles $|\mathbf{r}|$ is small.
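To get a feel for how small the correction is, one can evaluate the final result for $\pi(q^2)$ numerically. The sketch below is illustrative only: it assumes $\alpha = e^2/4\pi \approx 1/137.036$ (consistent with $e^2/8\pi^2 = \alpha/2\pi$ used later in these notes), uses a hand-rolled Simpson rule, and the function names are made up for this example.

```python
import math

ALPHA = 1 / 137.035999            # fine-structure constant (assumed value)
E_SQUARED = 4 * math.pi * ALPHA   # e^2, using alpha = e^2 / (4 pi)


def vacuum_polarization(q2_over_m2, steps=2000):
    """pi(q^2) = (e^2 / 2 pi^2) * int_0^1 dx x(1-x) ln(1 + (q^2/m^2) x(1-x)).

    The x-integral is done with composite Simpson's rule.
    """
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    h = 1.0 / steps

    def integrand(x):
        w = x * (1.0 - x)
        return w * math.log1p(q2_over_m2 * w)

    total = integrand(0.0) + integrand(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return E_SQUARED / (2 * math.pi ** 2) * total * h / 3


def small_q_limit(q2_over_m2):
    """Leading term for q^2 << m^2, from ln(1 + z) ~ z and int_0^1 x^2 (1-x)^2 dx = 1/30."""
    return E_SQUARED * q2_over_m2 / (60 * math.pi ** 2)


if __name__ == "__main__":
    for ratio in (0.0, 1e-4, 1e-2, 1.0, 100.0):
        print(ratio, vacuum_polarization(ratio), small_q_limit(ratio))
```

For $|\mathbf{q}| \ll m$ the correction scales like $\alpha\,\mathbf{q}^2/(15\pi m^2)$, so the potential is ordinary Coulomb at long distances and only the short-distance part is modified.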
If you think about the hydrogenic wavefunctions $R_{nl}(r)$, they are nonzero around the origin only for $l = 0$. Hence, the radiative corrections to the Coulomb potential tend to break the $l$-degeneracy of energy levels $E_{nlm}$, and especially so for $l = 0$ compared to $l \neq 0$. This is called the Uehling effect and the mathematical answer is given by Weinberg (11.2.42),
$$\Delta E_{n,l=0} = -\frac{4 Z^4 \alpha^5 m}{15\pi n^3}.$$
People saw this in experiments, Phys Rev 48 55 (1935). That seems pretty early to me.
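As a quick sanity check on the size of the Uehling shift, one can plug numbers into the formula above. This is a rough sketch: the numerical constants are assumed reference values (not taken from these notes) and the function names are illustrative.

```python
import math

ALPHA = 1 / 137.035999          # fine-structure constant (assumed value)
ELECTRON_MASS_EV = 510998.95    # electron rest energy in eV (assumed value)
PLANCK_EV_S = 4.135667696e-15   # Planck constant h in eV*s (assumed value)


def uehling_shift_ev(n, z=1):
    """Delta E_{n, l=0} = -4 Z^4 alpha^5 m / (15 pi n^3), returned in eV."""
    return -4 * z ** 4 * ALPHA ** 5 * ELECTRON_MASS_EV / (15 * math.pi * n ** 3)


def uehling_shift_mhz(n, z=1):
    """Same shift expressed as a frequency Delta E / h, in MHz."""
    return uehling_shift_ev(n, z) / PLANCK_EV_S / 1e6


if __name__ == "__main__":
    print("2s shift in hydrogen (eV): ", uehling_shift_ev(2))
    print("2s shift in hydrogen (MHz):", uehling_shift_mhz(2))
```

For the hydrogen 2s level this comes out at roughly $-27$ MHz, a small negative contribution; the $1/n^3$ scaling reflects $|\psi_{n,l=0}(0)|^2$.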
11.3
Interlude: Form factors
Suppose we want to calculate the scattering of a particle by an external EM field to first order
in the external field, but to all orders in all other interactions, such as the virtual corrections to
our particle’s propagation. We should make this general enough to handle when the external EM
field actually arises from the field of another charged particle, so the photon propagator can be
off-shell.
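According to the Gell-Mann-Low theorem (Weinberg section 6.4), if we introduce the external couplings $\epsilon_l$ into
$$V_\epsilon(t) = V(t) + \sum_l \int_{\mathbf{x}} \epsilon_l(\mathbf{x}, t)\, o_l(\mathbf{x}, t),$$
then
$$\left.\frac{\delta^r S_{\beta\alpha}}{\delta\epsilon_a(x)\,\delta\epsilon_b(y) \cdots}\right|_{\epsilon=0} = \langle\beta|\hat{T}\{-iO_a(x), -iO_b(y), \cdots\}|\alpha\rangle$$
where
$$O_a(\mathbf{x}, t) = \Omega(t)\, o_a(\mathbf{x}, t)\, \Omega^{-1}(t), \quad \Omega(t) = e^{iHt} e^{-iH_0 t}, \quad o_a(\mathbf{x}, t) = e^{iH_0 t}\, o_a(\mathbf{x})\, e^{-iH_0 t}.$$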
Question 164. What does the above Gell-Mann - Low theorem mean?
Answer 164. This is a nonperturbative identity. The LHS is equivalent to the sum of all Feynman
diagrams connecting the external sources oa , ob , · · · . The RHS is a matrix element. What this
says is if we can know something about this matrix element, then we will also know something
about the nonperturbative sum of all the diagrams. This is the idea of form factors.
Question 165. How can we exploit this very powerful identity to constrain the possible matrix
elements to a known form with unknown form factors?
Answer 165. Since the EM field is external, we take $o = J^\mu(x)$ (so it must have been coupled to an infinitesimal $\epsilon_\mu(x)$). Our initial and final states are on-shell one-particle states (with the same particle). So far, our matrix element is
$$\langle\Psi_{p'\sigma'}|J^\mu(x)|\Psi_{p\sigma}\rangle = e^{i(p - p')x}\, \langle\Psi_{p'\sigma'}|J^\mu(0)|\Psi_{p\sigma}\rangle.$$
By itself, translational invariance will not give us form factors. We must use two other constraints: Noether’s theorem (current conservation) and Lorentz invariance. Noether’s theorem gives
$$(p' - p)_\mu \langle\Psi_{p'\sigma'}|J^\mu(0)|\Psi_{p\sigma}\rangle = 0, \quad \langle\Psi_{p'\sigma'}|J^0(x = 0)|\Psi_{p\sigma}\rangle = \frac{q\,\delta_{\sigma\sigma'}\,\delta_{p',p}}{(2\pi)^3},$$
where $q$ is the particle charge.
Using Lorentz invariance is harder because it involves the irrep of the particle, and hence depends
on the particle’s spin. We will consider this in the following questions.
Question 166. Use Lorentz invariance to derive the form factors for a spin-zero particle.
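Answer 166. This is easy because we know that $|\Psi_p\rangle$ is an on-shell external state,
$$|\Psi_p\rangle = \frac{1}{(2p^0)^{1/2}}\, a_p.$$
Therefore, we have
$$\langle\Psi_{p'\sigma'}|J^\mu(x = 0)|\Psi_{p\sigma}\rangle = \frac{q\, J^\mu(p', p)}{(2\pi)^3 (2p'^0)^{1/2} (2p^0)^{1/2}}.$$
$J^\mu(p', p)$ must be a 4-vector function which satisfies
$$(p' - p)_\mu J^\mu(p', p) = 0.$$
As a 4-vector built only from $p'$ and $p$, $J^\mu(p', p)$ must be some linear combination of $p'^\mu$ and $p^\mu$. It’s really easy to show that since $(p' - p)\cdot(p' + p) = p'^2 - p^2 = 0$ on-shell, we must have
$$J^\mu(p', p) = (p'^\mu + p^\mu)\, F(k^2), \quad \text{where } k^2 = (p - p')^2.$$
The important thing about $k^2$ is that it depends only on $p_\mu p'^\mu$, which is the only nontrivial scalar you can make out of $p$ and $p'$. Because $J^0$ must integrate to the total charge $q$, we must have $F(k^2 = 0) = 1$. In fact, you could write it only in terms of some $\tilde{F}(p \cdot p')$, but this is not as pretty.
Nothing is known about $F$ except $F(k^2 = 0) = 1$. However, this function is not arbitrary; it is completely determined by the theory. The point is that it is the only thing left to calculate.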
Nothing is known about F except F (k 2 = 0) = 1. However, this function is not arbitrary; it is
completely determined by the theory. The point is that it is the only thing left to calculate.
Question 167. Use Lorentz invariance to derive the form factors for a spin-$\frac{1}{2}$ particle.
Answer 167. This is similar to the simpler spin-0 case. We find
$$\langle\Psi_{p'\sigma'}|J^\mu(x = 0)|\Psi_{p\sigma}\rangle = \frac{iq}{(2\pi)^3}\, \bar{u}_{p'\sigma'}\, \Gamma^\mu(p', p)\, u_{p\sigma}.$$
$\Gamma^\mu$ is a 4-vector $4 \times 4$ matrix function of $p$, $p'$, $\gamma$. What does this mean? It means for each choice of $\mu$, we can write $\Gamma^\mu$ as a linear combination of the 16 covariant matrices (see section 5.4 of these notes). Also, we have to make sure that
$$(p' - p)_\mu\, \bar{u}_{p'\sigma'}\, \Gamma^\mu(p', p)\, u_{p\sigma} = 0.$$
The result is that
$$\Gamma^\mu(p', p) = \gamma^\mu F(k^2) - \frac{i}{2m}(p'^\mu + p^\mu)\, G(k^2) \quad \text{and} \quad F(0) + G(0) = 1.$$
The last equality again comes from the spatial charge condition. Often, people rewrite this as
$$\Gamma^\mu(p', p) = \gamma^\mu F_1(k^2) + \frac{i}{2}[\gamma^\mu, \gamma^\nu](p'_\nu - p_\nu)\, F_2(k^2) \quad \text{and} \quad F_1(0) = 1.$$
The relation between these two representations is
$$F(k^2) = F_1(k^2) + 2m F_2(k^2) \quad \text{and} \quad G(k^2) = -2m F_2(k^2).$$
Question 168. Explain why the magnetic moment of a charged spin-$\frac{1}{2}$ particle is
$$\mu = \frac{q}{2m} F(0),$$
which is $\mu = \frac{q}{2m}$ without radiative corrections, which is Dirac’s result from the relativistic extension of the Schrodinger equation. (Or is it? I thought there was a missing $\frac{1}{2}$ due to the Thomas precession. Or is it that the Dirac form correctly includes the Thomas precession, and it was only the semiclassical one that didn’t?)
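Answer 168. This follows from remembering that this matrix element is merely a sandwich (roughly speaking) where the pieces of bread are the one-particle states and the good stuff in the middle is
$$\int_x J^\mu A_\mu.$$
Then, we take just the $\mathbf{B}$-field part of $A_\mu$ and see what comes out. If $H_{\text{int}} = -\int_{\mathbf{x}} \mathbf{J}(\mathbf{x}) \cdot \mathbf{A}(\mathbf{x})$, the matrix element is like
$$\langle\Psi_{p'\sigma'}|H_{\text{int}}|\Psi_{p\sigma}\rangle = -\frac{q F(0)}{m}\, (\mathbf{J})_{\sigma\sigma'} \cdot \mathbf{B}\, \delta_{pp'} \quad \text{where } \mathbf{J} = \tfrac{1}{2}\boldsymbol{\sigma} \text{ for spin-}\tfrac{1}{2}.$$
This makes sense if you think of the interaction $H_{\text{int}}$ as roughly proportional to a particle density or something. Note that the $\mathbf{J}$ in the matrix element is the spin matrix, not the current $\mathbf{J}(\mathbf{x})$ appearing in $H_{\text{int}}$. With $\mathbf{J} = \frac{1}{2}\boldsymbol{\sigma}$ this is $-\frac{qF(0)}{2m}\,\boldsymbol{\sigma} \cdot \mathbf{B}$, i.e. the familiar $-\boldsymbol{\mu} \cdot \mathbf{B}$ coupling of a magnetic moment to $\mathbf{B}$ from classical electrodynamics, with $\mu = \frac{q}{2m} F(0)$.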
11.4
Anomalous magnetic moment
Now we calculate the shift in the magnetic moment of an electron. If you look at the previous
section of these notes, you can see how the magnetic moment is defined:
µ=
q
F (0).
2m
Here are the leading-order diagrams, to O(e2 ). As we will see, not all of them contribute:
We will ignore the corrections to the electron propagators; we have assumed the electrons are
on-shell, and physically we want the on-shell residue to remain the same (just as for the photon),
so the radiative correction is irrelevant. Anyway, we are trying to renormalize a vertex here and
not a propagator. We will not ignore the correction to the photon propagator, but it was done
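in the previous section and we can put it in later. Therefore, we will only calculate the last loop graph, which is
$$\Gamma^\mu_{\text{loop}}(p', p) = \int_k \left[(2\pi)^4 e\gamma^\rho\right] \times \left[\frac{-i}{(2\pi)^4}\,\frac{-i(\slashed{p}' - \slashed{k}) + m}{(p' - k)^2 + m^2 - i\epsilon}\right] \times \gamma^\mu \times \left[\frac{-i}{(2\pi)^4}\,\frac{-i(\slashed{p} - \slashed{k}) + m}{(p - k)^2 + m^2 - i\epsilon}\right] \times \left[(2\pi)^4 e\gamma_\rho\right] \times \left[\frac{-i}{(2\pi)^4}\,\frac{1}{k^2 - i\epsilon}\right].$$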
There are no external-particle propagators here, as expected.
Question 169. Summarize all the contributions to the vertex function Γµ .
Answer 169. There are a few contributions.
• The vanilla coupling γ µ . For a meditation on interesting things that are also vanilla, see
here.
• Γµloop (p0 , p), of course.
• The counterterm for the electron-photon vertex, (Z2 − 1)γ µ . This cancels the divergence in
Γµloop (p0 , p). To fix the cancellation, we will take p0 = p so that we can use π(0) = 0 in the
next bullet point. In fact, this will be an infrared divergence rather than a UV divergence.
You could guess this by looking at the powers of k in the integral for Γµloop (p0 , p).
• The radiative correction to the photon propagator,
$$\Gamma^\mu_{\text{photon}}(p' - p) = \frac{\Pi^{\mu\nu}(q)\,\gamma_\nu}{q^2 - i\epsilon}, \quad \text{where } q = p' - p.$$
This has already been regularized since it includes $(Z_3 - 1)$, so it is finite.
Question 170. Describe how to get Schwinger’s famous $\frac{\alpha}{2\pi}$ correction to the magnetic moment of the electron.
Answer 170. Recall from the previous section that
$$(p' - p)_\mu\, \bar{u}_{p'\sigma'}\, \Gamma^\mu(p', p)\, u_{p\sigma} = 0.$$
Also, the vertex is written
$$\Gamma^\mu(p', p) = \gamma^\mu F(k^2) - \frac{i}{2m}(p'^\mu + p^\mu)\, G(k^2) \quad \text{where } k = p - p'.$$
In fact, it’s not physically important what $\Gamma^\mu(p', p)$ is; the important quantity is $\bar{u}_{p'\sigma'} \Gamma^\mu(p', p) u_{p\sigma}$. After some brutal calculations, the result is
$$\bar{u}'\, \Gamma^\mu_{\text{loop}}(p', p)\, u = -\frac{4\pi^2 e^2}{(2\pi)^4} \int_0^1 dx \int_0^x dy \int_0^\infty \kappa^3\, d\kappa\; \bar{u}' \left[\gamma^\mu \left(-\kappa^2 + 2m^2(x^2 - 4x + 2) + 2q^2(y(x - y) + 1 - x)\right) - 2im(p'^\mu + p^\mu)\, x(1 - x)\right] u \times \left(\kappa^2 + m^2 x^2 + q^2 y(x - y)\right)^{-3}.$$
This matches the form factor template, in the sense that there are only terms linear in $p^\mu + p'^\mu$, or constants.
If you add all the contributions (see previous question) together, you’ll find that $F$ is IR-divergent and gets regulated by $Z_2$. However, $G$ is finite. Because $F(0) + G(0) = 1$ and we are interested only in the $k = 0$ case, the path of least resistance is merely to calculate $G(0)$. It turns out that
$$G(0) = -\frac{e^2}{8\pi^2} = -0.001161.$$
This is Schwinger’s $\frac{\alpha}{2\pi}$ result, Phys Rev 73 416 (1948).
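The arithmetic behind the number is short enough to check directly. A minimal sketch, assuming $\alpha = e^2/4\pi \approx 1/137.036$ (a reference value not quoted in these notes) and using $\mu = \frac{q}{2m}F(0)$ with $F(0) = 1 - G(0)$; the function names are illustrative:

```python
import math

ALPHA = 1 / 137.035999  # fine-structure constant (assumed value)


def schwinger_g0(alpha=ALPHA):
    """G(0) = -e^2 / (8 pi^2), with e^2 = 4 pi alpha, i.e. G(0) = -alpha / (2 pi)."""
    e_squared = 4 * math.pi * alpha
    return -e_squared / (8 * math.pi ** 2)


def magnetic_moment_factor(alpha=ALPHA):
    """F(0) = 1 - G(0): the factor multiplying the Dirac value q / 2m."""
    return 1 - schwinger_g0(alpha)


if __name__ == "__main__":
    print("G(0) =", schwinger_g0())
    print("mu / (q/2m) =", magnetic_moment_factor())
```

So the magnetic moment is the Dirac value times $1 + \frac{\alpha}{2\pi}$, about a 0.12% increase.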
Question 171. Describe the idea of the electron’s “charge radius.” This is described in Weinberg
pp. 491-493.
Answer 171. The “charge radius” is how thick the electron’s clothes are. It depends on how hot
it is outside; the hotter it is, the less clothes the electron wears and the more “bare skin” you can
see.
I feel that defining a “charge radius” is really strange because everyone taught me that the cross-section of scattering off a Coulomb potential is infinite. So, is it that the electron cloud ends at this “charge radius,” or that the electron cloud is infinitely large?
Anyway, on page 493, Weinberg defines this “radius” as having to do with the $q \to 0$ limit of the form factor $F_1(q^2)$. It turns out that
$$F_1(q^2) \to 1 - \frac{a^2}{6}\, q^2 \quad \text{as } q \to 0,$$
where $a$, defined by
$$a^2 = -\frac{e^2}{4\pi^2 m^2}\left[\ln\!\left(\frac{\mu^2}{m^2}\right) + \frac{2}{5} + \frac{3}{4}\right],$$
is the “charge radius.” It is positive because $\mu$ is an IR cutoff and therefore $\mu \ll m$. According to (11.3.27) and (11.3.28), the main contribution to this “radius” is from $F(0)$ (of course, since we found $G(0)$ wasn’t divergent to one-loop). I should probably ask somebody how to think about this.
11.5
Electron propagator
By now, we’re pretty good at renormalizing things and using counterterms. Let us do just one
more example. As you can imagine, the logic of renormalizing the electron propagator is the same
as that for renormalizing the photon propagator. It turns out that the loop diagram Σ∗loop has
both a UV and an IR divergence. We will regulate the UV divergence below, and it turns out that
the UV divergence by itself is enough to fix the counterterms. We will find out how to regulate
the IR divergence later.
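Question 172. Explain how to regulate the UV divergence of the loop graph using the Pauli-Villars scheme.
Answer 172. Weinberg takes care of the divergence by using Pauli-Villars regularization on the photon propagator: subtract the propagator of a fictitious heavy photon of mass $\mu$, and send $\mu \to \infty$ at the end,
$$\frac{1}{k^2 - i\epsilon} \to \lim_{\mu \to \infty}\left[\frac{1}{k^2 - i\epsilon} - \frac{1}{k^2 + \mu^2 - i\epsilon}\right].$$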
Question 173. Outline all the contributions to the electron self-energy. How do we fix the
counterterms?
Answer 173. There are the counterterms and the virtual-photon contribution, all at $O(e^2)$. Schematically, all the $O(e^2)$ contributions put together are
$$\Sigma^*(p) = -(Z_2 - 1)(i\slashed{p} + m) + Z_2\,\delta m + \Sigma^*_{\text{loop}}(p).$$
Here, $m$ is the renormalized mass. We again use the physical condition that the renormalized propagator $\Delta'(p)$ should have a pole at $i\slashed{p} = -m$ with residue unity. Location of pole + unit residue is really two facts, so this gives us two equations:
$$\delta m = -\Sigma^*_{\text{loop}}\Big|_{i\slashed{p} = -m} \quad \text{and} \quad Z_2 - 1 = -i\,\frac{\partial \Sigma^*_{\text{loop}}}{\partial \slashed{p}}\bigg|_{i\slashed{p} = -m}.$$
There are two counterterms here, but there was only one counterterm for the photon propagator. This is because the photon mass stays at zero regardless of radiative corrections. However, the electron mass can change, so there is one additional degree of freedom, and hence one additional counterterm.
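The two equations can be read off in two lines. This is a sketch, assuming the complete electron propagator has the form $[i\slashed{p} + m - \Sigma^*(p) - i\epsilon]^{-1}$ up to normalization (an assumption about conventions, not something derived in these notes), so that a pole at $i\slashed{p} = -m$ with unit residue means $\Sigma^*$ and its first derivative both vanish there:

```latex
% Pole position: \Sigma^* vanishes at i\slashed{p} = -m
0 = \Sigma^*\Big|_{i\slashed{p}=-m}
  = Z_2\,\delta m + \Sigma^*_{\rm loop}\Big|_{i\slashed{p}=-m}
\;\Longrightarrow\;
\delta m = -\Sigma^*_{\rm loop}\Big|_{i\slashed{p}=-m} + O(e^4),
% since Z_2 = 1 + O(e^2) and \delta m = O(e^2).
% Unit residue: the derivative of \Sigma^* vanishes at the pole
0 = \frac{\partial \Sigma^*}{\partial \slashed{p}}\bigg|_{i\slashed{p}=-m}
  = -i\,(Z_2-1) + \frac{\partial \Sigma^*_{\rm loop}}{\partial \slashed{p}}\bigg|_{i\slashed{p}=-m}
\;\Longrightarrow\;
Z_2 - 1 = -i\,\frac{\partial \Sigma^*_{\rm loop}}{\partial \slashed{p}}\bigg|_{i\slashed{p}=-m}.
```

The first condition fixes the mass counterterm and the second fixes the field-strength counterterm, which is the counting of two facts for two unknowns described above.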
11.6
Problems
Question 174. Calculate the contributions to the vacuum polarization function $\pi(q^2)$ and to $Z_3$ of one-loop graphs involving a charged spinless particle of mass $m_s$. What effect does this have on the energy shift of the 2s state of hydrogen, if $m_s \gg Z\alpha m_e$?
Answer 174. TODO
Question 175. Consider a neutral scalar field $\phi$ with mass $m_\phi$ and self-interaction $\frac{g}{3!}\phi^3$. To one-loop, calculate the S-matrix element for scalar-scalar (i.e. 2 → 2) scattering.
Answer 175. TODO
12
General renormalization theory
Question 176. What was the conclusion of Dyson’s paper Phys Rev 75 486, 1736 (1949)?
Answer 176. Dyson showed that when we express all parameters of a theory in terms of “renormalized” (i.e. physical) quantities, the divergences cancel to all orders. Actually, this is true only for the renormalizable theories.
Non-renormalizable theories can have their divergences cancelled as well, but we must use an infinite number of counterterms. This is allowed because there are an infinite number of interactions allowed by symmetries.
This is definitely the easiest chapter in the book. Phew!
12.1
Degrees of divergence
We would like to estimate roughly how badly a loop graph is going to diverge. We can do this by
writing down the value of the diagram and seeing whether the integrals blow up at large or small
momentum. There is a more systematic way to do this, however, based on graph theory.
Question 177. What is the “superficial degree of divergence?” What is so superficial about it?
Answer 177. The superficial degree of divergence, $D$, is the power with which the loop integral grows at large momentum $k$: the graph behaves like
$$\int^\infty dk\, k^{D-1}.$$
It is superficial because this $D$ is not always the true degree of divergence of the graph. In particular, for $D = 0$ you could have a log-divergent graph or a convergent graph. It depends. Or, you could be sneaky and take one loop momentum to infinity, but not the other.
Along these lines, it has been shown by Weinberg Phys Rev 118 838 (1959) that the requirement
for actual convergence of any graph is that the degree of divergence should be D < 0 for not only
the complete graph, but also for any subintegration defined by holding any one or more linear
combinations of the loop momenta fixed.
Question 178. Explain why the superficial degree of divergence of a graph is
$$D = 4 - \sum_f E_f(s_f + 1) - \sum_i N_i \Delta_i, \quad \text{where } \Delta_i = 4 - d_i - \sum_f n_{if}(s_f + 1).$$
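Answer 178. First, we introduce notation. Let there be interactions of types $i$, fields of type $f$, and let there be $n_{if}$ of the field $f$ involved in the interaction $i$. Let the number of derivatives in the interaction of type $i$ be $d_i$. In a given graph, let $N_i$ be the number of vertices of type $i$, and let $I_f$ and $E_f$ be the numbers of internal and external lines of type $f$. The asymptotic behavior of the propagator for field $f$ is taken to be
$$\Delta_f(k) \sim k^{-2 + 2s_f} \quad \text{as } k \to \infty.$$
For scalar fields, $s_f = 0$. For the electron and photons, $s_f = \frac{1}{2}$ and $s_f = 0$, respectively. The degree of divergence is
$$D = \sum_f I_f(2s_f - 2) + \sum_i N_i d_i + 4\left[\sum_f I_f - \sum_i N_i + 1\right].$$
The last term counts the independent loop momenta: one $d^4k$ per internal line, minus one momentum δ-function per vertex, plus one because overall momentum conservation is not a constraint on the loops. Using the line-counting identity $2I_f + E_f = \sum_i N_i n_{if}$ to eliminate $I_f$, this turns into the answer above.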
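It is easy to check on the one-loop QED graphs of the previous chapter that the two ways of counting agree. The sketch below is illustrative: the dictionaries describing the QED vertex and the graphs are hypothetical data structures made up for this example, with $s_f = \frac{1}{2}$ for the electron and $s_f = 0$ for the photon as stated above.

```python
from fractions import Fraction

# Assumed QED input: s_f for each field, and the single vertex  e psibar A-slash psi
# (no derivatives, two electron fields, one photon field).
SPIN_PARAM = {"electron": Fraction(1, 2), "photon": Fraction(0)}
QED_VERTEX = {"derivatives": 0, "fields": {"electron": 2, "photon": 1}}


def delta_i(vertex, spin_param=SPIN_PARAM):
    """Delta_i = 4 - d_i - sum_f n_if (s_f + 1)."""
    return 4 - vertex["derivatives"] - sum(
        n * (spin_param[f] + 1) for f, n in vertex["fields"].items()
    )


def degree_from_external(external, n_vertices, vertex=QED_VERTEX, spin_param=SPIN_PARAM):
    """D = 4 - sum_f E_f (s_f + 1) - N * Delta  (one vertex type only)."""
    return (
        4
        - sum(e * (spin_param[f] + 1) for f, e in external.items())
        - n_vertices * delta_i(vertex, spin_param)
    )


def degree_from_internal(internal, n_vertices, vertex=QED_VERTEX, spin_param=SPIN_PARAM):
    """D = sum_f I_f (2 s_f - 2) + N d + 4 (sum_f I_f - N + 1)  (one vertex type only)."""
    propagators = sum(i * (2 * spin_param[f] - 2) for f, i in internal.items())
    loops = sum(internal.values()) - n_vertices + 1
    return propagators + n_vertices * vertex["derivatives"] + 4 * loops


# (external lines, internal lines, number of vertices) for the one-loop graphs
GRAPHS = {
    "vacuum polarization": ({"electron": 0, "photon": 2}, {"electron": 2, "photon": 0}, 2),
    "electron self-energy": ({"electron": 2, "photon": 0}, {"electron": 1, "photon": 1}, 2),
    "vertex correction": ({"electron": 2, "photon": 1}, {"electron": 2, "photon": 1}, 3),
}

if __name__ == "__main__":
    for name, (external, internal, n) in GRAPHS.items():
        print(name, degree_from_external(external, n), degree_from_internal(internal, n))
```

The superficial degrees come out as 2, 1 and 0 respectively. The actual divergences are milder (for instance, gauge invariance pulls two powers of $q$ out of the vacuum polarization), which is one more sense in which $D$ is only superficial.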
Question 179. Describe the importance of ∆i for the interaction of type i.
Answer 179. ∆i describes the scaling of the interaction of type i and turns out to be the mass
dimension of the coupling constant (see next question). If we think about
$$D = 4 - \sum_f E_f(s_f + 1) - \sum_i N_i \Delta_i,$$
it’s apparent that as we increase the number of vertices, D will decrease for ∆i > 0 and increase
for ∆i < 0. Therefore we introduce the following vocabulary:
• An interaction with ∆i > 0 is called superrenormalizable.
• An interaction with ∆i ≥ 0 is called renormalizable.
• An interaction with ∆i < 0 is called nonrenormalizable.
A theory in which ∆i ≥ 0 for all i is termed renormalizable. For any theory, there are only a
finite number of interactions i which are renormalizable. Usually we can always write all of them
down.
For intuition on this point, see http://math.ucr.edu/home/baez/renormalizability.html.
Question 180. Relate Weinberg’s notation with the $s_f$ and $\Delta_i$ and whatnot to the mass dimensions of the fields and couplings.
Answer 180. Actually Weinberg’s method is more rigorous than the mass dimension method. But for certain things, either is fine. To get the mass dimensions, remember that $S = \int_x \mathcal{L}$ is dimensionless, so $[\mathcal{L}] = 4$. Then we get the following:
• For a field $f$, $[f] = s_f + 1$.
• For an interaction $g$ of generic form
$$\mathcal{L}_{\text{int}} = g_i\, p^{d_i} f_1^{n_{i1}} \cdots f_m^{n_{im}},$$
we find that
$$[g_i] = \Delta_i.$$
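The dimension counting above is mechanical enough to script. Here is a small sketch that computes $[g_i] = \Delta_i$ for a few standard interactions and labels them with the vocabulary of the previous question; the field names, the example list, and the choice to report the most specific label (so $\Delta_i > 0$ is reported as superrenormalizable even though it is also renormalizable) are choices made for this illustration.

```python
from fractions import Fraction

# Mass dimensions [f] = s_f + 1 in four spacetime dimensions.
FIELD_DIMENSION = {
    "scalar": Fraction(1),
    "fermion": Fraction(3, 2),
    "photon": Fraction(1),
}


def coupling_dimension(derivatives, fields):
    """[g_i] = Delta_i = 4 - d_i - sum_f n_if [f]."""
    return 4 - derivatives - sum(n * FIELD_DIMENSION[f] for f, n in fields.items())


def classify(delta):
    """Most specific label for an interaction with coupling dimension delta."""
    if delta > 0:
        return "superrenormalizable"
    if delta == 0:
        return "renormalizable"
    return "nonrenormalizable"


# (number of derivatives, field content) for some example interactions
EXAMPLES = {
    "phi^3": (0, {"scalar": 3}),
    "phi^4": (0, {"scalar": 4}),
    "QED vertex psibar A psi": (0, {"fermion": 2, "photon": 1}),
    "Pauli term psibar [gamma, gamma] psi F": (1, {"fermion": 2, "photon": 1}),
    "four-fermion": (0, {"fermion": 4}),
}

if __name__ == "__main__":
    for name, (d, fields) in EXAMPLES.items():
        delta = coupling_dimension(d, fields)
        print(f"{name}: [g] = {delta}, {classify(delta)}")
```

Only finitely many combinations of fields and derivatives can have $\Delta_i \geq 0$, since every extra field or derivative lowers $\Delta_i$.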
12.2
Cancellation of divergences
In this section, we will make rigorous the idea that a renormalizable theory, with ∆i ≥ 0 for all
i, requires only a finite number of counterterms. We will finally explain why the counterterms
are formally divergent, i.e. why (Z2 − 1) has to have an infinite part. The explanation is very
interesting.
Question 181. Describe why a Feynman diagram with superficial degree of divergence D can be
written as a polynomial of order D in external momenta with divergent coefficients, plus a finite
remainder.
Answer 181. Roughly speaking, the integral is like
$$\int dk\, k^{D-1}.$$
We can make this convergent by differentiating D+1 times with respect to any external momentum
(i.e. not k). In fact, this is the idea behind “derivative regularization;” see Schwartz QFT.
(Actually, there are technicalities here and this might not always work.) Then we integrate again
and get a polynomial in the external momenta.
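Let us do an example to see it in action. The integral $\int_0^\infty \frac{k\,dk}{k + q}$ has $D = 1$, so we differentiate twice with respect to $q$:
$$\frac{d^2}{dq^2} \int_0^\infty \frac{k\, dk}{k + q} = \int_0^\infty \frac{2k\, dk}{(k + q)^3} = \frac{1}{q}.$$
Now we integrate twice:
$$\int_q \frac{1}{q} = \ln q + C \quad \text{and} \quad \int_q (\ln q + C) = a + bq + q \ln q.$$
Here, $a$ and $b$ are divergent constants. They are independent of both $k$ and $q$!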
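The same structure shows up if we regulate the example with a hard cutoff instead of differentiating. The sketch below assumes a cutoff $\Lambda$ on the $k$ integral (my choice of regulator for illustration, not the procedure in the text), for which the integral can be done in closed form: $\int_0^\Lambda \frac{k\,dk}{k+q} = \Lambda - q\ln\frac{\Lambda + q}{q}$. Subtracting the divergent polynomial $a + bq$ with $a = \Lambda$ and $b = -\ln\Lambda$ leaves a cutoff-independent remainder.

```python
import math


def cutoff_integral(q, cutoff):
    """Exact value of int_0^cutoff k dk / (k + q) = cutoff - q ln((cutoff + q) / q)."""
    return cutoff - q * math.log((cutoff + q) / q)


def finite_part(q, cutoff):
    """Remove the divergent polynomial a + b q, with a = cutoff and b = -ln(cutoff)."""
    a = cutoff
    b = -math.log(cutoff)
    return cutoff_integral(q, cutoff) - (a + b * q)


if __name__ == "__main__":
    q = 2.0
    for cutoff in (1e2, 1e4, 1e6):
        print(cutoff, cutoff_integral(q, cutoff), finite_part(q, cutoff), q * math.log(q))
```

As the cutoff grows, the integral itself blows up linearly, but the remainder converges to $q \ln q$: the divergences sit entirely in the coefficients of a polynomial of order $D = 1$ in the external momentum.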
Question 182. Explain why it is kosher to add an infinite counterterm to cancel the divergent
constants mentioned in the previous question.
Answer 182. A polynomial term in external momenta is exactly what could be produced by
adding another interaction to the Lagrangian, with the right derivatives to get those momenta,
etc. But how do we know this ad-hoc procedure is justified?
“All that we ever measure is the sum of the bare coupling constant and the corresponding coefficient
from one of the divergent polynomials, so if we demand that the sum equals the (presumably finite)
measured value, then the bare coupling must automatically contain an infinity that cancels the
infinity from the divergent integral over internal momenta.”
“One qualification: where the divergence occurs in a graph or subgraph with just two external lines,
which appears as a radiative correction to a particle propagator, we must demand not that some
effective coupling constant equals its measured value, but rather that the complete propagator has
a pole at the same position and with the same residue as for free particles.”
The bare coupling contains the infinity. The renormalized coupling doesn’t contain the infinity;
instead, the infinity goes to the counterterm.
About the second point that Weinberg makes: you can’t “measure” a propagator because, well,
you need an external source interacting with a particle to measure anything, and then we wouldn’t
have just a propagator anymore (for example, this is how we measured the magnetic moment of
the electron in the previous chapter). By “free particles,” Weinberg does not mean those with
bare mass. The free particles have the renormalized mass. Weinberg just calls anything with the
pole structure
$$\frac{1}{k^2 - m^2}$$
a “free particle.” See section 10.3 of these notes.
The most rigorous way of eliminating superficial divergences is called the BPHZ prescription.
It looks boring.
12.3 Is renormalizability necessary?
There is a lot of deep physics in this section. I will think about it carefully.
Question 183. Does the QED Lagrangian contain all Lorentz- and gauge-invariant terms?
Answer 183. No! The term
$$\bar{\psi} [\gamma_\mu, \gamma_\nu] \psi F^{\mu\nu}$$
is both Lorentz- and gauge-invariant. However, it is not renormalizable. Experimentally, this
would affect the magnetic moment of the electron, since it is roughly like ψ̄ψA. It seems that
nonrenormalizable terms are not allowed in QED. In fact, people historically arrived at the theories
of weak and strong interactions by excluding nonrenormalizable terms. But how do we know this
procedure is valid? Weinberg’s viewpoint is that the requirement of renormalizability is not a
fundamental principle, but rather something which is a very good approximation at low energies.
Question 184. Explain why requiring theories to be renormalizable is a good approximation at
low energy scales.
Answer 184. This is a strange argument, but here it goes. We know from section 12.1 of these
notes that
$$[g_i] = \Delta_i.$$
Renormalizable theories are those for which $\Delta_i \geq 0$. We guess that the behavior of the coupling
constants in all theories in the standard model, etc. are basically like
$$g_i \sim M^{\Delta_i},$$
where M is some very large common mass. Indeed, people think this is actually the case. I
learned this from Girma:
Figure 1: Picture from https://www.nobelprize.org/prizes/physics/2004/popular-information/
The more I think about it, the more outrageous this “common mass” claim becomes. Anyway,
suppose $\Delta_i < 0$. We know $g_i \sim M^{\Delta_i}$, and we can argue there has to be a $k$-dependence coming in
too, so the effect of this interaction looks like
$$\left( \frac{M}{k} \right)^{\Delta_i}.$$
$k$ is on the order of the masses in the problem, usually, since $\sqrt{k^2} \sim E = \gamma m c^2$. The electron
mass is 0.511 MeV. Above, it looks like $M = 10^{15}$ GeV. That is a big difference and suppresses
the effect of the interaction by a lot.
For the renormalizable interactions, on the other hand, I think there is no $M^{\Delta_i}$ involved in the
size of the coupling constant. See the second problem solved in this chapter.
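To get a feel for the size of the suppression, here is a back-of-the-envelope sketch that plugs in the numbers quoted above: $k$ of order the electron mass and $M = 10^{15}$ GeV. The function name and the particular values of $\Delta_i$ are only illustrative.

```python
def suppression(k_gev, big_mass_gev, delta):
    """Rough size (M/k)**delta of an interaction whose coupling scales like M**delta."""
    return (big_mass_gev / k_gev) ** delta

ELECTRON_MASS_GEV = 0.511e-3  # 0.511 MeV, as quoted in the text
BIG_MASS_GEV = 1e15           # the "common mass" M read off the figure

for delta in (0, -1, -2):
    # delta = 0 is the marginal (renormalizable) case: no suppression at all.
    print(delta, suppression(ELECTRON_MASS_GEV, BIG_MASS_GEV, delta))
```

Each extra negative unit of $\Delta_i$ costs another factor of $k/M$, roughly $5 \times 10^{-19}$ at these scales.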
Question 185. Describe how nonrenormalizable interactions could be detected, even if their effect
is strongly suppressed.
Answer 185. If the nonrenormalizable interaction breaks a symmetry of the low-energy theory,
then maybe you could detect a symmetry breaking. Apparently people think that conservation of
baryon and lepton number is violated by the small effects of nonrenormalizable interactions. Since
these symmetries of the low-energy theory are not symmetries of the fundamental Lagrangian,
whatever the fundamental theory is, they are termed accidental symmetries. Humorously,
most of the experimentally discovered symmetries of particle physics are accidental.
Gravitation, which is itself a nonrenormalizable interaction, can be detected because gravitational fields add up rather than shielding each other.
From this viewpoint, the low-energy theory is an effective field theory. If we wanted to, we could
include some correction terms by expanding in powers of k/M .
Question 186. Explain why you can still calculate things in a nonrenormalizable field theory.
Answer 186. Suppose I have a nonrenormalizable field theory L . For simplicity, there is only
one interaction with coupling constant g, so I can split it like
$$\mathcal{L} = \mathcal{L}_0 + \mathcal{L}_{\text{int}}.$$
If I want to calculate things to $O(g^n)$, I will have a finite number of diagrams. Therefore, I will
have a finite number of counterterms to add to the Lagrangian. This can always be done.
It’s important to note that the convergence of high-loop terms in perturbation theory depends
on the size of the coupling g, and not on whether the interaction is renormalizable. Therefore,
we can always calculate things to high precision even in nonrenormalizable theories. It is just
that in renormalizable theories, eventually we won’t need to add any extra counterterms. In
nonrenormalizable theories, we must add some counterterms whenever we go to higher order.
12.4 Problems
Question 187. List all the renormalizable (or superrenormalizable) Lorentz-invariant terms in
the Lagrangian of a single scalar field for the following spacetime dimensionalities: 2, 3, 6.
Answer 187. This is a fun exercise. We will use $\Delta_i = d - d_i - \sum_f n_{if} (s_f + 1)$ a lot. (I replaced
$4 \to d$ in the preceding equation.)
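• d = 2: In d = 2, $s_\phi + 1 = 0$, so $\Delta_i = 2 - d_i$. Any interaction with an arbitrary number of fields
and 2 or fewer derivatives is renormalizable by power counting. Lorentz invariance requires the
derivatives to be contracted in pairs, so this means 0 or 2 derivatives.
• d = 3: In d = 3, $s_\phi + 1 = 1/2$, so $\Delta_i = 3 - d_i - n/2$ for $n$ fields. Power counting alone allows
three or four fields with one derivative, and three to six fields with no derivatives. A single
derivative on a scalar leaves a free Lorentz index, so the one-derivative terms are not
Lorentz-invariant; the renormalizable interactions are $\phi^3$, $\phi^4$, $\phi^5$, and $\phi^6$.
• d = 6: In d = 6, $s_\phi + 1 = 2$, so $\Delta_i = 6 - d_i - 2n$. The only renormalizable interaction is the one
with 3 fields and no derivatives.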
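The counting above is mechanical enough to automate. The sketch below enumerates interactions of a single scalar by (number of fields, number of derivatives) and keeps those with $\Delta_i \geq 0$. It assumes $s_\phi + 1 = (d - 2)/2$, which reproduces the three values used above, and it only tries even numbers of derivatives so that Lorentz indices can be contracted in pairs. The function names and the search bounds are hypothetical.

```python
from fractions import Fraction

def delta(d, n_fields, n_derivs):
    """Delta_i = d - d_i - n * (s_phi + 1) for n scalar fields in d dimensions."""
    field_dim = Fraction(d - 2, 2)  # assumed mass dimension of the scalar field
    return d - n_derivs - n_fields * field_dim

def renormalizable_interactions(d, max_fields=8, max_derivs=4):
    """List (n_fields, n_derivs) with n_fields >= 3, even n_derivs, and Delta_i >= 0."""
    found = []
    for n_fields in range(3, max_fields + 1):
        for n_derivs in range(0, max_derivs + 1, 2):
            if delta(d, n_fields, n_derivs) >= 0:
                found.append((n_fields, n_derivs))
    return found

for d in (2, 3, 6):
    print(d, renormalizable_interactions(d))
```

In d = 2 the list never terminates on its own (every number of fields is allowed), so the output there is cut off only by `max_fields`.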
Question 188. Suppose that the quantum electrodynamics of electrons and photons is actually
an effective field theory, derived by integrating out unknown particles of mass $M \gg m$, where $m$
is the electron mass. Assume gauge invariance and Lorentz invariance, but not invariance under
C, P, T. What are the nonrenormalizable terms in the Lagrangian, to leading order in $1/M$? Of
next to leading order?
Answer 188. The idea here is that all the terms of order $O(M^0)$ should be renormalizable. The
terms of order $O(M^{-n})$, where $n \geq 1$, may not be. We would like to find those terms for $n = 1$.
For simplicity, let the particle with mass M be Ψ and the electron, of mass m, be ψ. For ideas,
see Weinberg section 12.5.
I think the most general Lagrangian we can write is
L =
13 Infrared effects
Radiative corrections include both UV and IR effects. For UV effects, we saw that we needed to
add counterterms to cancel the high-energy divergences.
The same is not true for IR effects. Very generally, the divergences in IR-divergent graphs cancel
when we add them up to get physical processes. Physically they come from photons of infinitely
large wavelength, but not massive particles of infinitely large wavelength. This is easy to see by
comparing
$$\frac{i g_{\mu\nu}}{q^2 + i\epsilon} \quad \text{with} \quad \frac{1}{q^2 + m^2 + i\epsilon}.$$
The first one diverges as $q \to 0$; the second does not.
The main result of this chapter is the following:
The infrared divergence problem is solved with the observation that it is not really possible to
measure the rate $\Gamma_{\beta\alpha}$ for a reaction $\alpha \to \beta$ involving definite numbers of photons and charged
particles, since photons of very low energy can always escape undetected. What can be
measured is the rate $\Gamma_{\beta\alpha}(E, E_T)$, where no photon has energy greater than some small $E$ and no
more than some small energy $E_T$ goes into any number of unobserved photons.
To get to this, we will have to slog through a lot of uninteresting calculations.
13.1 Soft photon amplitudes
A soft photon is one which has very low momentum, $q \to 0$; specifically, it should have momentum
$q \ll \Lambda$, where $\Lambda$ is a typical energy scale of the scattering process. We would like to know the
amplitude for emission of a soft photon, in the following sense:
Figure 2: The solid lines can represent either massive particles or “hard” (i.e. $q \not\ll \Lambda$) photons.
The soft photons can come off of either “reactant” α or “product” β particles.
The main result of this section is that the matrix element for emitting a single soft photon with
momentum $q$ and polarization index $\mu$ in the process $\alpha \to \beta$ is given in the $q \to 0$ limit by
$$M^\mu_{\beta\alpha} \to M_{\beta\alpha} \sum_n \frac{\eta_n e_n p_n^\mu}{p_n \cdot q - i \eta_n \epsilon},$$
where $n$ labels a particle in α or β, $\eta_n = \pm 1$ for product/reactant particles, and $e_n$ is the charge
of particle $n$. In the case of $N$ soft photons, this is generalized to
$$M^{\mu_1 \cdots \mu_N}_{\beta\alpha}(q_1 \cdots q_N) \to M_{\beta\alpha} \prod_{r=1}^{N} \sum_n \frac{\eta_n e_n p_n^{\mu_r}}{p_n \cdot q_r - i \eta_n \epsilon}.$$
Question 189. Describe why this is true for the case of a soft photon emitted from an electron
line.
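Answer 189. Compared to the usual $M_{\beta\alpha}$, emitting a photon with momentum $q$ from an outgoing
electron line means the electron leaves the rest of the diagram with momentum $p + q$ (so that the
final electron momentum matches the measured momentum $p$). We therefore need to multiply $M_{\beta\alpha}$ by
$$\left[ -(2\pi)^4 e \gamma^\mu \right] \left[ \frac{-i}{(2\pi)^4} \, \frac{-i(\not{p} + \not{q}) + m}{(p + q)^2 + m^2 - i\epsilon} \right].$$
Taking $q \to 0$ and using
$$-i\not{p} + m = 2p^0 \sum_{\sigma'} u_{p\sigma'} \bar{u}_{p\sigma'} \quad \text{as well as} \quad \bar{u}_{p\sigma} \gamma^\mu u_{p\sigma'} = -i \delta_{\sigma\sigma'} \frac{p^\mu}{p^0}$$
gives the correct result, $\frac{e p^\mu}{p \cdot q - i\epsilon}$. You also need to use the fact that the external particle is on-shell,
so $p^2 + m^2 = 0$.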
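For reference, here is how the pieces combine, written out under the sign conventions of the factor above. This is my own sketch of the algebra, not a quote from Weinberg, and the $\epsilon$ in the last two lines has been rescaled by a positive factor.

```latex
\begin{align*}
(p+q)^2 + m^2 &= (p^2 + m^2) + 2\,p\cdot q + q^2 \;\to\; 2\,p\cdot q
  \qquad (p^2 + m^2 = 0,\ q \to 0), \\
-(2\pi)^4 e\gamma^\mu \cdot \frac{-i}{(2\pi)^4}\,
  \frac{2p^0 \sum_{\sigma'} u_{p\sigma'}\bar{u}_{p\sigma'}}{2\,p\cdot q - i\epsilon}
  &= \frac{i e\, p^0\, \gamma^\mu \sum_{\sigma'} u_{p\sigma'}\bar{u}_{p\sigma'}}{p\cdot q - i\epsilon}, \\
\bar{u}_{p\sigma}\,\gamma^\mu u_{p\sigma'} = -i\,\delta_{\sigma\sigma'}\frac{p^\mu}{p^0}
  \;&\Longrightarrow\; \frac{e\,p^\mu}{p\cdot q - i\epsilon}\,\bar{u}_{p\sigma}.
\end{align*}
```

The leftover $\bar{u}_{p\sigma}$ is the spinor that the rest of the diagram would have carried anyway, so the net effect is the scalar factor $e p^\mu / (p \cdot q - i\epsilon)$.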
It turns out that the general structure $\frac{\eta_n e_n p_n^\mu}{p_n \cdot q - i \eta_n \epsilon}$ holds for all different kinds of particles interacting
with the photons. You can check it, for example, with scalar QED. Generally, the numerator of
the propagator approaches a sum of dyads of coefficient functions, which convert the new vertex
matrix into a factor proportional to $p^\mu$ and a unit matrix in helicity indices.
Question 190. Does the contribution $\frac{\eta_n e_n p_n^\mu}{p_n \cdot q - i \eta_n \epsilon}$ receive radiative corrections?
Answer 190. Nope. It is on-shell, and the residue of the mass-shell poles (or the matrix element
of the electric current between states of the same particle at equal momentum, i.e. the propagator)
does not receive radiative corrections.
Question 191. Why can’t a hard photon emit a soft photon?
Answer 191. Duh. There is no photon-photon vertex in QED.
Question 192. Why don’t we count soft photons emitted from internal lines, i.e. the “circle” in
the diagram?
Answer 192. The internal lines are not on-shell, $p^2 + m^2 \neq 0$. So they will not have the $\frac{1}{p \cdot q}$-type
divergence for $q \to 0$ that the external lines have.
Question 193. Show the following theorem:
For spin-1 massless particles, Lorentz invariance requires the conservation of whatever coupling
constant (like electric charge) governs the interaction of these particles at low energies.
Answer 193. I was very surprised by this theorem. Here is the idea. The soft photon is an
outgoing state, so to calculate numbers we need to take
$$M^\mu_{\beta\alpha}(q) \, e_\mu(q, \pm),$$
but $e_\mu$ is not a four-vector. Under Lorentz transform, it can pick up a longitudinal component $q_\mu$.
If you look back at section 8.1 of these notes,
$$U(\Lambda) \, a_\mu(x) \, U^{-1}(\Lambda) = \Lambda_\mu{}^\nu a_\nu(\Lambda x) + \partial_\mu \Omega(x, \Lambda).$$
Therefore for Lorentz invariance to hold, we must have
$$M^\mu_{\beta\alpha}(q) \, q_\mu = 0 \implies \sum_n \eta_n e_n = 0.$$
Therefore, e must be conserved. More generally, any coupling constant must be conserved; we did
not have to use gauge invariance for this proof. Although I think it is equivalent to using gauge
invariance.
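The step from the vanishing of the contraction with $q_\mu$ to the sum over charges uses the soft-photon factor from the start of this section. Written out (a short sketch, assuming that factor and dropping the $i\epsilon$ at the end):

```latex
\begin{align*}
q_\mu M^\mu_{\beta\alpha}(q)
  \;\to\; M_{\beta\alpha} \sum_n \frac{\eta_n e_n\, p_n\cdot q}{p_n\cdot q - i\eta_n\epsilon}
  \;=\; M_{\beta\alpha} \sum_n \eta_n e_n
  \qquad (\epsilon \to 0).
\end{align*}
```

Since $M_{\beta\alpha}$ is generically nonzero, the sum must vanish: with $\eta_n = +1$ for products and $-1$ for reactants, the total outgoing charge equals the total incoming charge.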
13.2 Virtual soft photons
Using the result of the previous section, we calculate to all orders the effect of radiative corrections
of virtual soft photons among the charged particle lines in the process α → β. Let the soft photons
be defined as those for which $q \ll \Lambda$. For now, we will also constrain $q$ with the IR cutoff $q > \lambda$.
Note the difference between Λ and λ: the first is a definition of which photons are soft; the second
is a mathematical cutoff which will be taken to 0.
Figure 3: Orange lines are virtual soft photons.
Question 194. Let the matrix element for the process with no soft radiative corrections be $M_{\beta\alpha}$.
What is the matrix element for the process with $N$ soft virtual photons?
Answer 194. We use the results of the previous section. It’s not terribly easy to write down, but
after you see it it becomes obvious. The matrix element gets multiplied by
$$\frac{1}{N! \, 2^N} \left[ \frac{1}{(2\pi)^4} \sum_{nm} e_n e_m \eta_n \eta_m J_{nm} \right]^N,$$
where
$$J_{nm} = -i (p_n \cdot p_m) \int_{\lambda \leq |q| \leq \Lambda} \frac{d^4 q}{(q^2 - i\epsilon)(p_n \cdot q - i \eta_n \epsilon)(-p_m \cdot q - i \eta_m \epsilon)}.$$
Question 195. Describe the computation of all virtual effects.
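Answer 195. We take the answer to the previous question and sum over all $N$, giving an exponential:
$$M^\lambda_{\beta\alpha} = M^\Lambda_{\beta\alpha} \exp \left[ \frac{1}{2 (2\pi)^4} \sum_{nm} e_n e_m \eta_n \eta_m J_{nm} \right].$$
Here $M^\lambda_{\beta\alpha}$ keeps all photons with momenta greater than $\lambda$ and $M^\Lambda_{\beta\alpha}$ keeps all photons with
momenta greater than $\Lambda$. Then you have to compute $J_{nm}$. I do not find this very pleasing. The
end result is
$$\Gamma^\lambda_{\beta\alpha} = \left( \frac{\lambda}{\Lambda} \right)^{A(\alpha \to \beta)} \Gamma^\Lambda_{\beta\alpha}.$$
Here $\Gamma^\lambda_{\beta\alpha}$ keeps all photons with momenta greater than $\lambda$ and $\Gamma^\Lambda_{\beta\alpha}$ keeps all photons with momenta
greater than $\Lambda$.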
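The exponentiation step is just the Taylor series of the exponential: terms of the form $x^N / (N! \, 2^N)$ summed over $N$ give $e^{x/2}$. A quick numerical check, with a real number `x` standing in for the bracketed combination of charges and $J_{nm}$ (the real-valued stand-in and the function name are my own simplifications):

```python
import math

def summed_soft_factor(x, n_max):
    """Partial sum over N = 0..n_max of x**N / (N! * 2**N)."""
    return sum(x**n / (math.factorial(n) * 2**n) for n in range(n_max + 1))

x = -3.0  # arbitrary stand-in value
for n_max in (1, 5, 40):
    # The partial sums approach exp(x / 2) as more soft photons are included.
    print(n_max, summed_soft_factor(x, n_max), math.exp(x / 2))
```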
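Apparently, $A(\alpha \to \beta)$ (we always have $A > 0$) is this complicated expression:
$$A(\alpha \to \beta) = -\frac{1}{8\pi^2} \sum_{nm} \frac{e_n e_m \eta_n \eta_m}{\beta_{nm}} \ln \left( \frac{1 + \beta_{nm}}{1 - \beta_{nm}} \right), \quad \text{where} \quad \beta_{nm}^2 = 1 - \frac{m_n^2 m_m^2}{(p_n \cdot p_m)^2}.$$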
It seems that the “effect of the infrared divergences introduced by soft virtual photons when
summed to all orders is to make the rate for any charged particle process α → β vanish in the
λ → 0 limit.” That is weird. Put more and more infinities together and you end up with nothing.
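To make the exponent less abstract, here is a sketch that evaluates $A$ for the simplest case I can think of: a single particle of charge $e$ scattering from momentum $p$ to $p'$, so that $n$ runs over one reactant ($\eta = -1$) and one product ($\eta = +1$). The choice of process, the convention $e^2 = 4\pi\alpha$, and all function names are assumptions made for illustration. The diagonal terms have $\beta_{nn} = 0$, where the kernel $(1/\beta) \ln[(1+\beta)/(1-\beta)]$ takes its limiting value 2.

```python
import math

ALPHA = 1 / 137.036  # fine-structure constant; assumes e**2 = 4 * pi * ALPHA

def kernel(beta):
    """(1/beta) * ln((1 + beta) / (1 - beta)), with its beta -> 0 limit of 2."""
    if beta == 0.0:
        return 2.0
    return math.log((1 + beta) / (1 - beta)) / beta

def exponent_single_charge(beta):
    """A for one charged particle going from p to p', with cross-term parameter beta."""
    e_squared = 4 * math.pi * ALPHA
    # Two diagonal terms (eta_n**2 = +1, beta_nn = 0) and two cross terms (eta_n * eta_m = -1).
    total = 2 * kernel(0.0) - 2 * kernel(beta)
    return -e_squared * total / (8 * math.pi**2)

def virtual_suppression(lam, big_lambda, beta):
    """The factor (lam / Lambda)**A multiplying the rate."""
    return (lam / big_lambda) ** exponent_single_charge(beta)

for lam in (1e-3, 1e-30, 1e-300):
    print(lam, virtual_suppression(lam, 1.0, 0.9))
```

Because $A$ is small (of order $\alpha/\pi$), the factor $(\lambda/\Lambda)^A$ goes to zero very slowly, but it does go to zero.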
I think the ultimate idea was in the computation of
$$J_{nm} = -i (p_n \cdot p_m) \int_{\lambda \leq |q| \leq \Lambda} \frac{d^4 q}{(q^2 - i\epsilon)(p_n \cdot q - i \eta_n \epsilon)(-p_m \cdot q - i \eta_m \epsilon)}.$$
Obviously if you take λ = 0 initially, everything blows up. However, if you keep λ > 0 initially, then
sum everything, and then take λ → 0, then things are okay. This is like the exponential regulator
that Girma taught me, useful for finding the Fourier transform of the Coulomb potential, for
example.
13.3 Cancellation of divergences
Our claim is that if we include real soft photons (i.e. soft external photons) along with the soft
virtual photons, then we will get a cancellation of divergences. This means the scattering process
will be independent of the IR cutoff, λ.
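Reference the beginning of this chapter’s notes for the notation. I am not interested in doing many
integrals, but the final result is
$$\Gamma^\lambda_{\beta\alpha}(E, E_T) \to F\left( \frac{E}{E_T}; A(\alpha \to \beta) \right) \left( \frac{E}{\Lambda} \right)^{A(\alpha \to \beta)} \Gamma^\Lambda_{\beta\alpha}.$$
This is independent of $\lambda$.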
How can we think about this? Essentially, we start with the process which only includes virtual
photons with IR cutoff λ. The rate for this process vanishes as λ → 0. However, unitarity says that if we
use λ as the IR cutoff for the virtual photons, we must also use it as the IR cutoff for the real
(external) photons. Real soft emission gives a factor that diverges as λ → 0, and of course it multiplies
the rate which only includes virtual photons.
The rate which includes only virtual photons goes to 0. The multiplicative factor due to real soft
emission goes to infinity. It turns out that these effects exactly cancel and the overall rate is independent
of the IR cutoff. This is the meaning of “cancellation of divergences.”
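A toy numerical version of this cancellation: take the virtual factor $(\lambda/\Lambda)^A$ from the previous section and multiply it by a real-emission factor of the form $(E/\lambda)^A$. The latter form is an assumption on my part, chosen because it is what is needed to reproduce the $(E/\Lambda)^A$ in the final result; the $\lambda$-independent function $F$ is left out, and the numbers are arbitrary.

```python
def virtual_factor(lam, big_lambda, a):
    """(lam / Lambda)**A from soft virtual photons; vanishes as lam -> 0 for A > 0."""
    return (lam / big_lambda) ** a

def real_factor(lam, energy_cut, a):
    """Assumed form (E / lam)**A from real soft photons; diverges as lam -> 0."""
    return (energy_cut / lam) ** a

def combined_factor(lam, energy_cut, big_lambda, a):
    """Product of the two; the lam dependence should drop out."""
    return virtual_factor(lam, big_lambda, a) * real_factor(lam, energy_cut, a)

A, E, BIG_LAMBDA = 0.003, 0.01, 1.0  # illustrative values only
for lam in (1e-3, 1e-12, 1e-100):
    print(lam, combined_factor(lam, E, BIG_LAMBDA, A), (E / BIG_LAMBDA) ** A)
```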