Exercises in special and general relativity

IFA Challenge Track 2018/19, Module 2
Oliver Kirsebom and Thomas Tram
September 19, 2018
Solutions must be typed in LaTeX, Word, or similar. Lengthy derivations can be
included in an appendix as scans of handwritten notes (provided the handwriting is
readable). All derivations and explanations must be self-contained, i.e.,
variables and notation must be properly introduced. You are strongly encouraged to
include graphical illustrations to support your argumentation. There are 12 problems
in total. Each problem consists of one or several sub-problems, indicated by letters
(a, b, etc.). Points are given for each sub-problem as follows: solved = 2, partially
solved = 1, not solved = 0. There are 50 sub-problems in total, and hence, the
maximum number of points is 100. To pass a minimum of 50 points is required. The
problems fall into three categories: Four-vectors and the Lorentz transformation (1–
3), kinematics in Special Relativity (4–8) and aspects of General Relativity (9-12).
The exercises are not designed to test what you already know, but they attempt to
teach you something new. This also means that if you are stuck on an exercise you
should not hesitate to ask your instructor or class-mates for a hint. The later problems
are not necessarily more difficult than the first ones, but some of them depend slightly
on notation introduced in earlier problems. We encourage you to solve at least one
sub-problem of each main-problem.
1 Four-vectors
2 Matrix formulation of the Lorentz transformation and rapidity
3 General boost
4 Particle production at colliders
5 Two-body decay
6 Three-body decay and Mandelstam variables
7 Dalitz plot
8 Angular momentum
9 Differential geometry and metric
10 Principle of extremal proper time
11 Gravity as geometry
12 Gravitational waves
Four-vectors
Read chapter 24 in the book. Let $a = (a^0, a^1, a^2, a^3)$ be a vector of four components,
and define the scalar product
$$a \cdot b \equiv -a^0 b^0 + a^1 b^1 + a^2 b^2 + a^3 b^3 .$$
The position-time four-vector $x = (ct, x, y, z)$ is an example of a four-vector. As you
will show below, the quantity $x \cdot x$ is invariant under Lorentz transformations. In fact,
one can define the Lorentz transformations as the set of transformations that leave
this scalar product invariant. Any physical theory that is in accordance with special
relativity has to be constructed from objects that transform nicely (are “covariant”)
under Lorentz transformations.
(a) Argue that $x \cdot x$ is invariant under Lorentz transformations.
(b) The energy-momentum four-vector is defined as $p \equiv (E/c, \vec{p})$ where $E$ is the
energy of the particle and $\vec{p}$ is its (three-)momentum. What is $p \cdot p$? Is it
invariant under Lorentz transformations?
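A quick numerical experiment can build intuition before attempting the argument in part (a). The sketch below assumes the standard x-boost, ct' = γ(ct − βx) and x' = γ(x − βct), and compares the scalar product of an arbitrary event with itself before and after the boost. The function names and the sample event are illustrative choices, not part of the exercise.

```python
import math

def minkowski_dot(a, b):
    """Scalar product with signature (-, +, +, +), as defined in Problem 1."""
    return -a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]

def boost_x(vec, beta):
    """Standard Lorentz boost along x with speed beta = v/c (assumed form)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    ct, x, y, z = vec
    return (gamma * (ct - beta * x), gamma * (x - beta * ct), y, z)

event = (2.0, 0.5, -1.0, 3.0)          # arbitrary (ct, x, y, z)
boosted = boost_x(event, 0.6)

interval = minkowski_dot(event, event)
interval_boosted = minkowski_dot(boosted, boosted)
print(interval, interval_boosted)      # the two numbers should agree
```

A numerical agreement for one event is of course not a proof; it only shows what the algebra in part (a) has to reproduce.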
Matrix formulation of the Lorentz transformation and rapidity
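(a) Show that the Lorentz transformation, equations (6.31, 6.32, 6.21, 6.22) in Ulrik’s book, can be written as a matrix equation
$$\begin{pmatrix} ct' \\ x' \\ y' \\ z' \end{pmatrix} = \Lambda \begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix},$$
where $\Lambda$ is the 4 × 4 matrix,
$$\Lambda = \begin{pmatrix} \gamma & -\gamma\beta & 0 & 0 \\ -\gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$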
(b) Derive the inverse transformation by isolating ct, x, y, z in equations (6.31,
6.32, 6.21, 6.22). Construct the matrix Λ̃ corresponding to this transformation.
(c) Show that $\tilde{\Lambda}$ is the matrix inverse of $\Lambda$, i.e., $\tilde{\Lambda}\Lambda = I$ where $I$ is the identity matrix.
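(d) Introducing the rapidity $\phi$, defined by $\gamma = \cosh\phi$, show that $\Lambda$ can be expressed as
$$\Lambda = \Lambda(\phi) = \begin{pmatrix} \cosh\phi & -\sinh\phi & 0 & 0 \\ -\sinh\phi & \cosh\phi & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$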
Notice how, in this form, the Lorentz transformation looks similar to an ordinary rotation in three-dimensional space. Thus, we may think of a Lorentz
transformation as a rotation in four-dimensional space-time about an “angle” φ.
1 cosh and sinh are the hyperbolic cosine and sine functions, given by $\cosh x = \tfrac{1}{2}(e^{x} + e^{-x})$ and $\sinh x = \tfrac{1}{2}(e^{x} - e^{-x})$.
(e) Show that two successive Lorentz transformations along the same direction with
rapidities $\phi_1$ and $\phi_2$ are equivalent to a single Lorentz transformation with rapidity $\phi = \phi_1 + \phi_2$, i.e., show that
$$\Lambda(\phi_1 + \phi_2) = \Lambda(\phi_2)\Lambda(\phi_1).$$
Because of this additive property, rapidity can be a more useful concept than
velocity in relativistic kinematics.
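The additive property in part (e) can be explored numerically before proving it. The sketch below uses only the (ct, x) block of Λ(φ) and compares the product of two boosts with a single boost of rapidity φ1 + φ2. It also assumes the relation β = tanh φ, which follows from γ = cosh φ and γβ = sinh φ, to show that velocities combine through the relativistic addition formula rather than by simple addition. The helper names are illustrative.

```python
import math

def boost_block(phi):
    """2x2 (ct, x) block of an x-boost with rapidity phi."""
    return [[math.cosh(phi), -math.sinh(phi)],
            [-math.sinh(phi), math.cosh(phi)]]

def matmul(a, b):
    """Product of two small square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

phi1, phi2 = 0.3, 1.1
combined = matmul(boost_block(phi2), boost_block(phi1))
direct = boost_block(phi1 + phi2)
max_diff = max(abs(combined[i][j] - direct[i][j])
               for i in range(2) for j in range(2))

# Rapidities add; velocities follow the relativistic addition formula.
beta1, beta2 = math.tanh(phi1), math.tanh(phi2)
beta_combined = (beta1 + beta2) / (1.0 + beta1 * beta2)
print(max_diff, beta_combined, math.tanh(phi1 + phi2))
```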
General boost
The Lorentz transformation in equation (2) is called a boost in the x-direction or also
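an x-boost. We shall now derive the boost matrix for a velocity $\vec{v}$ in the x, y-plane.
(a) Let a vector $\vec{b} = (b_x, b_y)$ be rotated by an angle $\theta$ in the positive direction (i.e.
anti-clockwise). Show that the new coordinates $(b'_x, b'_y)$ can be written as the
following matrix multiplication:
$$\begin{pmatrix} b'_x \\ b'_y \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} b_x \\ b_y \end{pmatrix}.$$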
(b) Denote the rotation matrix by $R(\theta)$. Show that
$$R(\theta)^{-1} = R(-\theta) = R(\theta)^{T} .$$
(c) Now let $\vec{v}$ be a velocity vector in the x, y-plane; you may assume $v_x > 0$ for
the remainder of this problem. What is the angle from the vector $\vec{v} = (v_x, v_y)$
to the x-axis? Remember that a rotation angle carries a sign given by the
right-hand rule.
(d) Write down the rotation matrix $R$ that rotates $\vec{v}$ in such a way that it is aligned
with the x-axis. Write it in terms of $v_x$, $v_y$ and $|\vec{v}|$.
(e) The Lorentz transformation $\Lambda(\vec{v})$ in the direction $\vec{v}$ can now be obtained as
follows: First rotate the velocity vector $\vec{v}$ using $R(\theta)$ to align it with the x-axis, then make an x-boost with speed $|\vec{v}|$ and finally undo the rotation using
$R(\theta)^{-1}$. In matrix notation we have:
$$\Lambda(\vec{v}) = R(\theta)\,\Lambda_x(|\vec{v}|)\,R(\theta)^{-1} .$$
Do the matrix multiplication and compute $\Lambda(\vec{v})$. Note that you should remove
the z-component from the $\Lambda_x$ matrix.
(f) Using the previous result, how would you generalise the boost to three spatial dimensions?
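Once you have a candidate matrix for part (e), two properties are worth checking numerically: the boost should bring a particle moving with velocity v to rest, and it should preserve the metric diag(−1, 1, 1). The sketch below builds the boost by the rotate, boost, rotate-back recipe acting on (ct, x, y). It assumes that `angle` is the polar angle of v measured anti-clockwise from the x-axis, so the sign may differ from the θ you defined in part (c). All function names are illustrative.

```python
import math

def matmul(a, b):
    """Product of two small square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rotation(angle):
    """Anti-clockwise rotation of the (x, y) part, acting on (ct, x, y)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def boost_x(beta):
    """x-boost with the z-component removed."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, -g * beta, 0.0], [-g * beta, g, 0.0], [0.0, 0.0, 1.0]]

def boost_via_rotation(beta_x, beta_y):
    """Rotate v onto the x-axis, boost along x, rotate back."""
    angle = math.atan2(beta_y, beta_x)   # polar angle of v (assumed convention)
    beta = math.hypot(beta_x, beta_y)
    return matmul(rotation(angle), matmul(boost_x(beta), rotation(-angle)))

beta_x, beta_y = 0.3, -0.5               # v/c components, arbitrary choice
lam = boost_via_rotation(beta_x, beta_y)

# Four-velocity (divided by c) of a particle moving with v: gamma * (1, beta).
gamma = 1.0 / math.sqrt(1.0 - beta_x**2 - beta_y**2)
u = [gamma, gamma * beta_x, gamma * beta_y]
u_boosted = [sum(lam[i][k] * u[k] for k in range(3)) for i in range(3)]

# Metric preservation: lam^T eta lam should equal eta = diag(-1, 1, 1).
eta = [[-1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
lam_t = [[lam[j][i] for j in range(3)] for i in range(3)]
eta_check = matmul(lam_t, matmul(eta, lam))
print(u_boosted)                         # expected to be close to [1, 0, 0]
```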
Particle production at colliders
Particle A hits particle B (at rest), producing particles $C_1, C_2, \ldots, C_n$.
(a) Calculate the minimum energy $E_{\mathrm{thres}}$ required for the reaction to proceed (called
the threshold energy) in terms of the rest masses $m_A$, $m_B$ and $M \equiv \sum_{i=1}^{n} m_{C_i}$.
(b) Using this formula, compute the threshold energy for the following reactions,
assuming that the target proton is stationary:
(i) $p + p \to p + p + p + \bar{p}$
(ii) $p + p \to p + p + \pi^0$
(iii) $p + p \to p + p + \pi^+ + \pi^-$
The rest mass of the proton and the antiproton is $m_p = m_{\bar{p}} = 938.3$ MeV/$c^2$,
while the rest masses of the neutral pion and the charged pions are $m_{\pi^0} =
135.0$ MeV/$c^2$ and $m_{\pi^\pm} = 139.6$ MeV/$c^2$. (MeV (“mega-electron-volt”) is a unit
of energy commonly used in nuclear and particle physics. 1 MeV is defined as
the energy acquired by an electron in traversing a potential difference of $10^6$ volts.)
(c) The Bevatron at the Lawrence Radiation Laboratory, Berkeley, California, was
designed to accelerate protons to a kinetic energy of 6 GeV to allow reaction (i)
to be observed. Owen Chamberlain and Emilio Segrè received the Nobel Prize
in 1959 for producing antiprotons. Is this discovery in accordance with your calculation?
(d) Consider now the reaction $p + p \to p + p + p + \bar{p}$, and assume that the target
proton is not fully at rest, but moves in the opposite direction of the incoming
proton with a small momentum $p_t$ (small in the sense that $p_t^2/2m_p \ll m_p c^2$).
This will often be the case in actual experiments. Further using that the speed
of the incoming proton is close to the speed of light, show that the threshold
energy is reduced by a factor of approximately $1 - p_t/(m_p c)$.
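A closed-form threshold formula can be cross-checked numerically. The sketch below assumes the usual threshold condition, namely that the reaction becomes possible once the invariant mass of the initial state reaches the summed rest mass M of the final state, and locates the corresponding beam energy by bisection. It works in units with c = 1 (masses and energies in MeV) and treats the target as exactly at rest. The function names and the upper bracket of the search are illustrative choices.

```python
import math

def invariant_mass(e_beam, m_a, m_b):
    """sqrt(s) for beam particle A (total energy e_beam) hitting B at rest, c = 1."""
    return math.sqrt(m_a**2 + m_b**2 + 2.0 * e_beam * m_b)

def threshold_energy(m_a, m_b, m_final_total, tol=1e-9):
    """Smallest total beam energy with sqrt(s) >= m_final_total, by bisection."""
    lo = m_a                              # beam particle at rest
    hi = m_a + 100.0 * m_final_total      # generous upper bracket (assumption)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if invariant_mass(mid, m_a, m_b) < m_final_total:
            lo = mid
        else:
            hi = mid
    return hi

m_p = 938.3                               # MeV/c^2, from the problem text
e_thres = threshold_energy(m_p, m_p, 4.0 * m_p)   # reaction (i)
print(e_thres, e_thres - m_p)             # total and kinetic threshold energy in MeV
```

The same call with a different final-state mass sum covers reactions (ii) and (iii); comparing the printed numbers with your formula from part (a) is the point of the exercise.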
Two-body decay
Consider a particle a that decays at rest into two particles 1 and 2. The particles
have masses $m_a$, $m_1$ and $m_2$.
(a) Find the energy of the outgoing particles in terms of the particle masses.
(b) Find the magnitude of the outgoing momenta.
(c) Express the equation you found in part (b) in terms of the function
$$\lambda(x, y, z) \equiv x^2 + y^2 + z^2 - 2xy - 2xz - 2yz .$$
[Hint: your final expression should look like $\ldots\lambda(a^2, b^2, c^2)$.]
(d) Show that $\lambda(a^2, b^2, c^2)$ factorises as
$$\lambda(a^2, b^2, c^2) = (a + b + c)(a + b - c)(a - b + c)(a - b - c) .$$
(e) Using the result above, what is the value of the outgoing momentum magnitude from part (b), $|\vec{p}_B|$, when $m_a = m_1 + m_2$?
(f) What happens when $m_a < m_1 + m_2$? Comment.
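The factorisation in part (d) is easy to spot-check numerically before proving it. The sketch below evaluates both sides for two hand-picked integer triples and for a batch of random positive values. The function names are illustrative.

```python
import random

def kallen(x, y, z):
    """The function lambda(x, y, z) defined in part (c)."""
    return x * x + y * y + z * z - 2 * x * y - 2 * x * z - 2 * y * z

def factorised(a, b, c):
    """Right-hand side of the factorisation in part (d)."""
    return (a + b + c) * (a + b - c) * (a - b + c) * (a - b - c)

# Exact integer checks.
exact_ok = (kallen(25, 4, 1) == factorised(5, 2, 1)
            and kallen(9, 4, 1) == factorised(3, 2, 1))

# Random floating-point checks.
random.seed(1)
worst = 0.0
for _ in range(1000):
    a, b, c = (random.uniform(0.1, 10.0) for _ in range(3))
    lhs = kallen(a * a, b * b, c * c)
    rhs = factorised(a, b, c)
    worst = max(worst, abs(lhs - rhs) / (1.0 + abs(lhs)))
print(exact_ok, worst)
```

The second integer triple has a = b + c, so both sides vanish there, which is relevant to part (e).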
Three-body decay and Mandelstam variables
Consider the three-body decay $a \to 1 + 2 + 3$, where particle a with mass $m_a$ decays into
particles 1, 2 and 3 with respective masses $m_1$, $m_2$ and $m_3$. In kinematic calculations,
it is advantageous to define a set of Lorentz invariants called Mandelstam variables.
For this decay they are defined as
$$s_{12} = -(p_1 + p_2)^2/c^2 , \qquad s_{23} = -(p_2 + p_3)^2/c^2 , \qquad s_{31} = -(p_3 + p_1)^2/c^2 .$$
(a) Show that $s_{12} + s_{23} + s_{31} = m_a^2 + m_1^2 + m_2^2 + m_3^2$.
You will now calculate the maximum momentum of particle 3, $|\vec{p}_3|$, in the rest frame
of particle a.
(b) Write the energy conservation equation only in terms of the masses, $\vec{p}_3$ and $s_{12}$.
(c) Using this equation show that $|\vec{p}_3|$ is maximal when $s_{12}$ is minimal.
(d) Let $\vec{p}_2'$ denote the momentum of particle 2 in the rest frame of particle 1 and
express $s_{12}$ in this frame. For what momentum $\vec{p}_2'$ is $s_{12}$ minimal?
(e) Since $s_{12}$ is Lorentz-invariant, $\vec{p}_2' = 0$ will minimise it in any frame, so particles
1 and 2 are co-moving in this limit. What is now the maximum value of $|\vec{p}_3|$?
You can either compute it directly or simply use the formula from Exercise 5
since the decay is now effectively a two-body decay.
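The sum rule in part (a) can be checked numerically for one arbitrary decay configuration. The sketch below works in units with c = 1, picks two three-momenta freely, fixes the third by momentum conservation in the rest frame of particle a, and lets energy conservation determine the parent mass. With the metric signature used here, −(p_i + p_j)² equals (E_i + E_j)² − |p_i + p_j|². The masses and momenta are illustrative values, and the function names are not part of the exercise.

```python
import math

def energy(mass, p):
    """On-shell energy for three-momentum p (units with c = 1)."""
    return math.sqrt(mass**2 + sum(comp * comp for comp in p))

def s_pair(e_i, p_i, e_j, p_j):
    """Mandelstam variable s_ij = (E_i + E_j)^2 - |p_i + p_j|^2."""
    e_sum = e_i + e_j
    p_sum = [u + v for u, v in zip(p_i, p_j)]
    return e_sum * e_sum - sum(comp * comp for comp in p_sum)

m1, m2, m3 = 0.5, 0.14, 0.14                 # illustrative masses
p1 = [0.3, -0.1, 0.2]                        # arbitrary three-momenta
p2 = [-0.4, 0.25, 0.05]
p3 = [-(u + v) for u, v in zip(p1, p2)]      # total momentum zero (rest frame of a)

e1, e2, e3 = energy(m1, p1), energy(m2, p2), energy(m3, p3)
m_a = e1 + e2 + e3                           # parent mass from energy conservation

lhs = s_pair(e1, p1, e2, p2) + s_pair(e2, p2, e3, p3) + s_pair(e3, p3, e1, p1)
rhs = m_a**2 + m1**2 + m2**2 + m3**2
print(lhs, rhs)                              # should agree to rounding error
```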
Dalitz plot
Due to the linear relationship between² $s_{12}$, $s_{23}$, and $s_{31}$, only two of the three variables are necessary to fully characterize a three-particle decay (the third variable can
be calculated from the first two assuming that the masses are known). In other words,
if someone gave you the values of $s_{12}$ and $s_{23}$ for a particular three-particle decay, you
would know everything you could possibly know about the “kinematics” of the decay.
The decay of the muon into an electron and two neutrinos
$$\mu^- \to e^- + \nu_\mu + \bar{\nu}_e$$
is an example of a three-particle decay. Some of you have already studied this decay
in the lab when you measured the lifetime of the muon. If, somehow, you had been
able to measure not only when the decays took place, but also what the momenta
of the electron and the neutrinos were, you would have found that the values of $s_{12}$
and $s_{23}$ change from one decay event to the next, reflecting the fact that the electron
and the two neutrinos can be emitted at different angles and at different energies.
However, $s_{12}$ and $s_{23}$ cannot take on arbitrary values; the range of possible
values is restricted by the fact that four-momentum must be conserved.
(a) Show that $(m_2 + m_3)^2 \le s_{23} \le (m - m_1)^2$, where $m$ is the mass of the decaying particle.
Figure 1 shows the region of allowed values of $s_{12}$ and $s_{23}$ (the region inside the
black contour). Six points labeled $A_1$, $A_2$, $A_3$ and $B_1$, $B_2$, $B_3$ have been singled
out. As illustrated by the little cartoons, each point corresponds to a very specific
decay scenario, e.g. the point labeled $B_2$ at the very top of the contour corresponds
to the situation where particle 1 is at rest, while particles 2 and 3 are emitted in
opposite directions. In general, it can be shown that points on the contour correspond
to decay events with all particles emitted along the same axis. Points inside the
contour correspond to more complicated emission patterns with the particles emitted
in different directions. The two-dimensional plot shown in Figure 1, with $s_{12}$ on the
first axis and $s_{23}$ on the second axis, is known as a Dalitz plot, named after its inventor
R. H. Dalitz (1925–2006).
Figure 1: The Dalitz plot. The black contour marks the region of allowed values of
$s_{12}$ and $s_{23}$.
2 See Problem 6-(a).
(b) Assume that the three-particle decay $a \to 1 + 2 + 3$ proceeds in two steps:
$a \to 1 + b$ followed by $b \to 2 + 3$. Show that $\sqrt{s_{23}} = m_b$, where $m_b$ is the mass
of the intermediate particle.
(c) Show that $\sqrt{s_{23}} = m_2 + m_3 + K_2'/c^2 + K_3'/c^2$ where $K_2'$ and $K_3'$ denote the
kinetic energies of particles 2 and 3 in their common center of mass frame.³
From (b) and (c) we conclude that in the non-relativistic limit $m_b \approx m_2 + m_3 +
K_2'/c^2 + K_3'/c^2$, i.e. the mass of particle b equals the combined rest mass of particles 2
and 3 plus the kinetic energy of their relative motion. In general, one refers to $\sqrt{s_{23}}$
as the invariant mass of particles 2 and 3, even when they do not originate from an
intermediate particle b.
Repeated measurements of $s_{12}$ and $s_{23}$ for a three-particle decay such
as $\mu^- \to e^- + \nu_\mu + \bar{\nu}_e$ would result in a distribution of data points in the Dalitz plot,
all lying inside or on the contour shown in Figure 1, but not outside it. In some
regions there would be many data points, in other regions there would be few. The
way the data points are distributed gives important clues about the decay process. Is
one particle emitted before the other two? Are they all emitted at the same time?
3 Physicists often use the term center of mass when they really mean center of momentum, i.e.
the frame in which the total momentum is zero. However, the two frames coincide as long as we are
dealing only with particles of non-zero mass (basically everything but photons).
An article published a few years ago by the LHCb collaboration at CERN describes
a study of the decay of the so-called strange B meson into a J/ψ meson and two π mesons,
$$B_s^0 \to J/\psi + \pi^+ + \pi^-$$
The collaboration observes about 7600 decay events. For each event they determine
the Mandelstam variables
$$s_{12} = -(p_1 + p_2)^2/c^2 , \qquad s_{23} = -(p_2 + p_3)^2/c^2 ,$$
where the indices 1, 2, 3 refer to J/ψ, $\pi^+$, $\pi^-$, respectively. Finally, they make a Dalitz
plot of their data as shown in Figure 2. The data points are seen to be clustered around
a constant value of $s_{23} \approx 0.98$ GeV$^2/c^4$ (the dark horizontal band).⁴
Figure 2: Dalitz plot of the data collected by LHCb at CERN on the decay of the strange B
meson into a J/ψ meson and two π mesons. Axes: $s_{12} \equiv m^2(J/\psi\,\pi^+)$ and $s_{23} \equiv m^2(\pi^+\pi^-)$, both in GeV$^2/c^4$.
(d) Give a physical interpretation of this. What does this tell us about the
decay of the strange B meson? How does the decay proceed?
4 1 GeV = $10^9$ eV = $1.60 \times 10^{-10}$ joules.
Angular momentum
In classical mechanics, angular momentum, $\vec{L}$, is defined as $\vec{L} = \vec{r} \times \vec{p}$. In special relativity,
angular momentum is a “tensor of rank two” (basically, a two-dimensional matrix)
defined as,
$$M^{\alpha\beta} = x^\alpha p^\beta - x^\beta p^\alpha ,$$
where, as usual, $x = (x^0, x^1, x^2, x^3) = (ct, \vec{x})$ is the position vector in space-time and
$p = (p^0, p^1, p^2, p^3) = (E/c, p_x, p_y, p_z)$ is the four-momentum.
Figure 3: Infinitesimal distance between two nearby points in a flat, two-dimensional
space represented in Cartesian coordinates (left) and polar coordinates (right).
(a) Show that the angular momentum tensor transforms as,
$$M'^{\alpha\beta} = \sum_{\gamma=0}^{3} \sum_{\delta=0}^{3} \Lambda^{\alpha}{}_{\gamma}\, \Lambda^{\beta}{}_{\delta}\, M^{\gamma\delta} .$$
(b) Consider a spinning wheel with angular momentum $\vec{L} = (L, 0, 0)$. Use the
angular momentum tensor to determine the angular momentum measured by
an observer moving at speed $v$ along (i) the axis of rotation, and (ii) an axis
perpendicular to the axis of rotation. [Hint: First use the definitions of $\vec{L}$ and
$M^{\alpha\beta}$ to identify the locations of $L_x$, $L_y$ and $L_z$ inside $M$. Then perform the
transformation directly.]
(c) Use the classical definition of angular momentum, $\vec{L} = \vec{r} \times \vec{p}$, to check your
results. [Hint: Divide the wheel into infinitesimal angular segments, each
contributing $d\vec{L} = \vec{r} \times d\vec{p}$ to the total angular momentum, and perform Lorentz
transformations of $\vec{r}$ and $d\vec{p}$ separately.]
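The transformation law in part (a) can be spot-checked numerically: transforming the tensor with two factors of Λ should give the same result as first transforming x and p and then rebuilding the tensor. The sketch below does this for an x-boost and arbitrary sample components. The helper names are illustrative, and the sample four-vectors are not meant to describe the wheel in part (b).

```python
import math

def boost_x(beta):
    """4x4 x-boost acting on (ct, x, y, z)."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, -g * beta, 0.0, 0.0],
            [-g * beta, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def apply(lam, vec):
    """Transform a four-vector: v'^a = sum_c lam^a_c v^c."""
    return [sum(lam[a][c] * vec[c] for c in range(4)) for a in range(4)]

def angular_momentum_tensor(x, p):
    """M^{ab} = x^a p^b - x^b p^a."""
    return [[x[a] * p[b] - x[b] * p[a] for b in range(4)] for a in range(4)]

def transform_tensor(lam, m):
    """M'^{ab} = sum_{c,d} lam^a_c lam^b_d M^{cd}."""
    return [[sum(lam[a][c] * lam[b][d] * m[c][d]
                 for c in range(4) for d in range(4))
             for b in range(4)] for a in range(4)]

x = [1.0, 0.2, -0.7, 0.4]        # arbitrary (ct, x, y, z)
p = [2.0, 0.5, 0.3, -1.1]        # arbitrary (E/c, px, py, pz)
lam = boost_x(0.8)

m_transformed = transform_tensor(lam, angular_momentum_tensor(x, p))
m_rebuilt = angular_momentum_tensor(apply(lam, x), apply(lam, p))
max_diff = max(abs(m_transformed[a][b] - m_rebuilt[a][b])
               for a in range(4) for b in range(4))
print(max_diff)
```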
Differential geometry and metric
To specify the geometry of an N-dimensional space it is sufficient to define how we
compute the distance between nearby points. For example, in a flat, two-dimensional
space the distance between nearby points may be expressed as $ds^2 = dx^2 + dy^2$ using
the Cartesian coordinates $(x, y)$ or as $ds^2 = dr^2 + r^2 d\phi^2$ using the polar coordinates
$(r, \phi)$, see Fig. 3. All other geometric properties can be computed from the definition
of $ds$. For example, the distances along curves can be calculated by integration, and
straight lines may be defined as the curves of the shortest distance between two points.
More generally, for an N-dimensional space we may write,
$$ds^2 = \sum_{\alpha=0}^{N-1} \sum_{\beta=0}^{N-1} \eta_{\alpha\beta}\, dx^\alpha dx^\beta ,$$
where $(x^0, x^1, \ldots, x^{N-1})$ is a set of coordinates and $\eta_{\alpha\beta}$ are the entries of an N × N
matrix referred to as “the metric tensor” or simply “the metric”. To simplify the
notation we will often omit the summation symbols and simply write,
$$ds^2 = \eta_{\alpha\beta}\, dx^\alpha dx^\beta .$$
(a) Show that the metric of a flat, two-dimensional space in polar coordinates is
given by,
$$\eta = \begin{pmatrix} 1 & 0 \\ 0 & r^2 \end{pmatrix} .$$
(b) Show that the metric of the surface of a two-dimensional sphere with radius $R$ is
given by,
$$\eta = \begin{pmatrix} R^2 & 0 \\ 0 & R^2 \sin^2\theta \end{pmatrix} ,$$
where $(\theta, \phi)$ are the angles of the three-dimensional polar coordinates.
(c) Show that the metric of the flat, four-dimensional space-time of special relativity
(also known as Minkowski space) is given by,
$$\eta = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} . \tag{16}$$
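The metrics in parts (a) and (b) can be cross-checked numerically. One common approach, assumed here, is to embed the surface in Cartesian coordinates and compute the induced metric as the product of the Jacobian with its transpose, with the derivatives taken by central differences. The sketch does this for the sphere; the radius, the evaluation point and the function names are illustrative.

```python
import math

def embed_sphere(theta, phi, radius):
    """Cartesian position of the point (theta, phi) on a sphere."""
    return (radius * math.sin(theta) * math.cos(phi),
            radius * math.sin(theta) * math.sin(phi),
            radius * math.cos(theta))

def induced_metric(embed, coords, h=1e-6):
    """g_ab = sum_i (dX_i/du_a)(dX_i/du_b), derivatives by central differences."""
    n = len(coords)
    tangents = []
    for a in range(n):
        plus = list(coords)
        minus = list(coords)
        plus[a] += h
        minus[a] -= h
        x_plus, x_minus = embed(*plus), embed(*minus)
        tangents.append([(u - v) / (2.0 * h) for u, v in zip(x_plus, x_minus)])
    return [[sum(ta * tb for ta, tb in zip(tangents[a], tangents[b]))
             for b in range(n)] for a in range(n)]

R, theta, phi = 2.0, 0.9, 2.3
g = induced_metric(lambda t, p: embed_sphere(t, p, R), (theta, phi))
expected = [[R**2, 0.0], [0.0, (R * math.sin(theta))**2]]
print(g)
print(expected)
```

Replacing `embed_sphere` with a map from (r, φ) to (r cos φ, r sin φ) gives the corresponding check for part (a).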
Principle of extremal proper time
An observer watches a particle move from A to B. Let $x = (ct, x, y, z) = (x^0, x^1, x^2, x^3)$
be the time and the particle’s position as determined by the observer, and let $\tau$ be
the time measured by a clock attached to the particle (the particle’s proper time).
(a) Show that the travel time from A to B measured by the clock attached to the
particle, $\tau_{AB} = \int_A^B d\tau$, can be expressed as,
$$\tau_{AB} = \frac{1}{c} \int_A^B \sqrt{(c\dot{t})^2 - \dot{x}^2 - \dot{y}^2 - \dot{z}^2}\; d\tau . \tag{17}$$
Let $\dot{x} = dx/d\tau$ denote the time-derivative of $x$ with respect to the proper time. Now,
suppose we are given a function $L(x, \dot{x})$ and asked to determine which path $x(\tau)$
maximizes the integral $\int_{\tau_{AB}} L\, d\tau$. This is not an easy problem to solve (after all, there
are infinitely many paths connecting A and B), but fortunately we have a theorem
to help us:
Theorem. The path $x(\tau)$ that maximizes the integral $\int_{\tau_{AB}} L\, d\tau$ satisfies the differential equation
$$\frac{\partial L}{\partial x^\alpha} - \frac{d}{d\tau} \frac{\partial L}{\partial \dot{x}^\alpha} = 0 , \tag{18}$$
for all coordinates $\alpha$. Eq. (18) is a very important equation in physics and is known
as the Euler-Lagrange equation.⁵
(b) Use the result from part (a) and the Euler-Lagrange equation (18) to show that
the path which maximizes the travel time $\tau_{AB}$ must fulfill,
$$\ddot{x}^\alpha = 0 \tag{19}$$
for all coordinates $\alpha$. You should be able to recognize Eq. (19) as the equations
of motion for a free particle. Comment on the result. What have we just shown?
5 The proof is not too difficult and can be found e.g. in Taylor, Classical Mechanics, or in Wikipedia
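The claim that the free (straight) path maximizes the proper time can be illustrated with two simple worldlines between the same events A and B. The sketch below sums the proper time over straight segments, using dτ² = dt² − dx²/c² for each segment, in units with c = 1 and one spatial dimension. The worldlines and the function name are illustrative choices.

```python
import math

def proper_time(worldline, c=1.0):
    """Proper time along a piecewise-straight worldline given as [(t, x), ...]."""
    total = 0.0
    for (t0, x0), (t1, x1) in zip(worldline, worldline[1:]):
        dt, dx = t1 - t0, x1 - x0
        total += math.sqrt(dt * dt - (dx / c) ** 2)
    return total

straight = [(0.0, 0.0), (10.0, 0.0)]              # particle stays at rest
detour = [(0.0, 0.0), (5.0, 3.0), (10.0, 0.0)]    # out and back at v = 0.6 c

tau_straight = proper_time(straight)
tau_detour = proper_time(detour)
print(tau_straight, tau_detour)                   # prints 10.0 8.0
```

Any detour between the same two events gives a shorter proper time than the straight worldline, which is the familiar twin-paradox result.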
Gravity as geometry
In Problem 10 we derived the equations of motion for a free particle from the principle
of extremal proper time, i.e., by requiring that the path taken by the particle between
points A and B is the one that maximizes the proper time difference. (Particles are
lazy. They follow the path that lets them travel with as much leisure as possible!)
(a) Show that the travel time in Eq. (17) can be expressed more compactly as,
$$\tau_{AB} = \frac{1}{c} \int_A^B \sqrt{-\eta_{\alpha\beta}\, dx^\alpha dx^\beta} ,$$
where $\eta_{\alpha\beta}$ is the metric of flat space-time (Eq. (16)).
(b) Derive the equations of motion for a free, non-relativistic ($v \ll c$) particle in a
curved space-time with the metric,
$$\eta = \begin{pmatrix} -(1 + 2\Phi/c^2) & 0 & 0 & 0 \\ 0 & 1 - 2\Phi/c^2 & 0 & 0 \\ 0 & 0 & 1 - 2\Phi/c^2 & 0 \\ 0 & 0 & 0 & 1 - 2\Phi/c^2 \end{pmatrix} ,$$
where $\Phi(x)$ is the Newtonian gravitational potential, which we assume to be
weak ($\Phi/c^2 \ll 1$). Comment on the result.
Gravitational waves
Read the paper Gravitational waves on the back of an envelope by B.F. Schutz. Focus
on sections I–III and try to understand as much as you can of the derivation leading
to Eq. (16). Note that in the paper vectors are written in bold face, while vector
components are written in italic with a subscript index, e.g., the vector $\vec{y}$ and its
components are written as y (bold) and $y_i$, respectively.
(a) To help you visualize the problem, draw a sketch showing the vectors $\vec{x}$ and $\vec{y}$,
the unit vector $\hat{n}$, and the length $r$.
(b) Consider the binary star system shown in Fig. 4. Two stars (which could be
neutron stars or black holes) of equal mass $M$ are in orbit about each other in
the x-y plane. The orbit is circular with radius $R$ and the orbital frequency is $\Omega$.
An observer placed far away at $\vec{x}$ measures the gravitational waves emitted as
the stars spiral in on one another. For this simple system it is straightforward
to calculate the quadrupole tensor from Eq. (14) in Schutz’ paper. The answer is,
$$I(t) = M R^2 \begin{pmatrix} 1 + \cos(2\Omega t) & \sin(2\Omega t) & 0 \\ \sin(2\Omega t) & 1 - \cos(2\Omega t) & 0 \\ 0 & 0 & 0 \end{pmatrix} .$$
Verify this for at least one of the entries in the matrix, for example $I_{11}$. (Treat
the stars as if they were point particles.)
(c) Use Eq. (16) in the paper to show that an observer located on the x-axis far away
from the binary star system will measure the amplitude of the gravitational
waves as being roughly,
$$h \approx \frac{2 G M R^2 \Omega^2}{c^4 |\vec{x}|} \cos\left[ 2\Omega (t - |\vec{x}|/c) \right] . \tag{23}$$
Figure 4: Binary star system.
Remember that Eq. (16) in the paper must be evaluated at the retarded time, $t - |\vec{x}|/c$. Also
note that summation over the indices is implied: $\ddot{I}_{ij} n_i n_j = \sum_i \sum_j \ddot{I}_{ij} n_i n_j$.
(d) Use the formula for the average power (energy/time) carried by a mechanical
wave (Young & Freedman, Ch. 15) and the result from part (c), to show that the
energy carried away by the gravitational wave per unit time, $L$, in the direction
of the observer, satisfies,
$$L \propto c^{-5} G^{-1} (G M \Omega)^{10/3} , \tag{24}$$
where the extra factors of $c$ and $G$ have been included to get the correct units
of energy/time. (Hint: Use the fact that the centrifugal force on each star
must be balanced by the gravitational attraction of the companion to derive
$R^3 = \tfrac{1}{4} G M \Omega^{-2}$, which is Kepler’s law for this binary system.) Incidentally,
Eq. (24) also holds for the total energy carried away by the gravitational wave
in all directions, but you do not have to show this.
(e) Still using Newtonian formulas, show that the total energy (kinetic + potential)
of the binary system is given by,
$$E = -\frac{G M^2}{4R} = -\tfrac{1}{4} M (2 G M \Omega)^{2/3} .$$
(f) Use the results from part (d) and (e) to show that the emission of gravitational
waves causes the orbital period to decrease, with the orbital frequency increasing at the rate,
$$\frac{d\Omega}{dt} = k\, c^{-5} (G M)^{5/3}\, \Omega^{11/3} ,$$
where $k$ is some positive, unitless constant. A correct calculation using general
relativity gives $k = 4^{1/3} \cdot \tfrac{48}{5}$.
Figure 5: Gravitational-wave signal measured by LIGO. Credit: LIGO/Caltech/MIT.
(g) Combine the results from part (c), (d) and (e) to show that the time dependence
of the gravitational-wave signal is,
$$h(t) \propto \Omega(t)^{2/3} \cos\left[ \phi_0 + 2 \int_0^t \Omega(t')\, dt' \right] ,$$
where
$$\Omega(t) = \left[ \Omega_0^{-8/3} - \tfrac{2}{5}\, c^{-5} (16 G M)^{5/3}\, t \right]^{-3/8} ,$$
and $\phi_0$ is some constant phase and $\Omega_0$ is the orbital frequency at $t = 0$. Note
that since the frequency changes with time, the product $\Omega t$ in Eq. (23) must
be replaced with the integral $\int_0^t \Omega(t')\, dt'$ (this is true for any wave when the
frequency is changing).
(h) Try different values of $\phi_0$ and $\Omega_0$ and see if you can produce a curve that
looks like the actual gravitational-wave signal measured by LIGO on January
4th, 2017 (Fig. 5). Note that the integral $\int_0^t \Omega(t')\, dt'$ can be computed analytically. There are online tools for plotting functions which you may find helpful,
for example Wolfram-Alpha’s Plotter widget: http://www.wolframalpha.com/
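As an alternative to an online plotter, the signal in part (g) can be tabulated with a few lines of Python and then plotted with any tool. The sketch below is one possible setup, assuming SI units and the closed-form phase that follows from integrating Ω(t) = (a − bt)^(−3/8), with a = Ω0^(−8/3) and b = (2/5) c⁻⁵ (16GM)^(5/3). The mass and starting frequency are illustrative values, not a fit to the LIGO data, and the overall amplitude is left unnormalised. All function names are hypothetical.

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8          # speed of light, m/s
M_SUN = 1.989e30     # solar mass, kg

def chirp_coefficient(mass):
    """b = (2/5) c^-5 (16 G M)^(5/3), the coefficient of t in Omega(t)."""
    return 0.4 * C ** -5 * (16.0 * G * mass) ** (5.0 / 3.0)

def omega(t, omega0, mass):
    """Orbital frequency Omega(t) = (Omega0^(-8/3) - b t)^(-3/8)."""
    return (omega0 ** (-8.0 / 3.0) - chirp_coefficient(mass) * t) ** (-3.0 / 8.0)

def orbital_phase(t, omega0, mass):
    """Closed-form integral of Omega(t') from 0 to t."""
    a = omega0 ** (-8.0 / 3.0)
    b = chirp_coefficient(mass)
    return 8.0 / (5.0 * b) * (a ** 0.625 - (a - b * t) ** 0.625)

def strain(t, omega0, mass, phi0=0.0):
    """Unnormalised signal Omega(t)^(2/3) * cos(phi0 + 2 * phase)."""
    return omega(t, omega0, mass) ** (2.0 / 3.0) * math.cos(
        phi0 + 2.0 * orbital_phase(t, omega0, mass))

mass = 30.0 * M_SUN                    # illustrative value for each star
omega0 = 2.0 * math.pi * 15.0          # illustrative starting orbital frequency, rad/s
t_merge = omega0 ** (-8.0 / 3.0) / chirp_coefficient(mass)   # where Omega diverges

# Sample the signal up to 95% of the formal divergence time.
times = [0.95 * t_merge * i / 2000 for i in range(2001)]
signal = [strain(t, omega0, mass) for t in times]
```

The lists `times` and `signal` can be written to a file or passed to a plotting library; varying `phi0`, `omega0` and `mass` changes the shape of the chirp.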