PHYS 652: Astrophysics
Lecture 1: Introduction, Outline and Motivation
“The most incomprehensible thing about the world is that it is comprehensible.”
Albert Einstein
Astrophysics is the branch of astronomy that deals with the physics of the Universe, including the
physical properties (luminosity, density, temperature, chemical structure) of celestial objects such
as stars, galaxies and the interstellar medium, as well as their interactions. Astrophysics is a very
broad subject: it includes mechanics, statistical mechanics, thermodynamics, electromagnetism,
relativity, particle physics, high energy physics, nuclear physics, and others.
Cosmology is theoretical astrophysics at its largest scales, where general relativity plays a major role. It deals with the Universe as a whole: its origin, distant past, evolution and structure. At such grand scales, the locally “flat” and “slow” approximation (the realm of Newtonian mechanics) is no longer justified.
Its subject matter involves important and overarching questions such as ‘How did we get here?’, ‘Was there a beginning?’ and ‘Are we special?’, which flirt heavily with philosophy and theology. As a result, modern cosmology has proven to be a dynamic battleground for competing ideas. In this arena, where the greatest scientific minds (and egos!) battled, we have many instances of drama, thrills, twists and, of course, mystery:
• a priest-scientist breaking with church canons to interpret his solutions as having “a day without yesterday” (Fr. Georges Lemaître), a progenitor of the term “Big Bang”;
• one scientist’s mockery of the opposing camp’s view immortalized (the term “Big Bang” was coined by Fred Hoyle, a proponent of the steady-state theory);
• a “fudge factor” introduced, then discarded in embarrassment, then later reintroduced as our only hope to get our cosmic books to balance (Einstein’s cosmological constant);
• the greatest experimental evidence for the Big Bang coming about by sheer accident (cosmic microwave background radiation);
• finally, we are still searching for answers as to what comprises about 96% of the content of the Universe. Over 70% of the mass-energy content of the Universe is in the form of an unknown vacuum energy called “dark energy”. Over 80% of the mass is in the form of the mysterious “dark matter”.
Course Outline
This course will be composed of three parts:
1. General relativity as the foundation of cosmology
Overview of the basic concepts of the theory of general relativity (GR) and the formalism it
provides for studying the evolution of the Universe:
(a) Spacetime: time and space treated on equal footing.
(b) GR uses tools of differential geometry: metrics, covariant and contravariant tensors,
invariants. When the equations of motion are written in tensor form, they keep the
same form under general coordinate transformations.
(c) Geodesic equation: how particles move in curved spacetime.
(d) Einstein’s equations: how matter curves spacetime.
(e) Solutions: Friedmann-Lemaître-Robertson-Walker Universe.
(f) The horizon problem leads to inflation theory. Inflation theory also explains the observed
flatness of the Universe. De Sitter Universe.
2. Interpreting the Universe
Implications of solutions to Einstein’s equations:
(a) Brief history of time: from the Big Bang to present day.
(b) Cosmic Microwave Background (CMB) radiation.
(c) Dark matter: possible candidates and the current search.
3. Black holes, stars and galaxies:
(a) Black holes: singularities of Einstein’s equations.
(b) Stars: structure, evolution and mathematical models.
(c) Galaxies: classification, evolution and mathematical models.
Motivation: Newton vs. Einstein
Newtonian mechanics is an approximation which works quite well for most of our “earthly” needs, at least when the velocity $v \ll c$, where c is the speed of light. The basic differences and analogies between Newtonian and Einsteinian physics are presented in Table 1.
Table 1: Differences and analogies between Newtonian and Einsteinian mechanics.
Newton | Einstein
absolute time and absolute space | spacetime
Galilean invariance of space (simultaneity) | Lorentz invariance of spacetime (time dilation, length contraction, no simultaneity)
existence of preferred inertial frames (at rest or moving with constant velocity wrt the absolute space) | no preferred frames (physics is the same everywhere)
infinite speed of light c (instantaneous action at a distance) | finite and fixed speed of light c (nothing propagates faster than c)
gravity is a force | gravity as a distortion of the fabric of spacetime
Newton’s Second Law | geodesic equation
Poisson equation | Einstein’s equations
Newtonian mechanics quickly runs into problems which cannot be explained within its realm:
• All observers measure the same speed of light c (in a vacuum), as demonstrated by the Michelson-Morley experiment.
• Electromagnetism does not respect Galilean invariance.
• Why do all bodies experience the same acceleration regardless of their mass, i.e., why is the
inertial and gravitational mass the same (as measured experimentally throughout history)?
Einstein’s theory of special relativity (SR) introduced some revolutionary concepts:
• “Abolished” absolute time — introduced 4D spacetime as an inseparable entity.
• Finite and fixed speed of light c.
• Established equivalence between energy and mass (massless photons are subject to gravity).
• However, the 4D spacetime considered in SR is still flat — Minkowski metric.
Einstein’s theory of general relativity continued the revolution:
• Equivalence principle: Established equivalence between the inertial and gravitational mass.
• Cosmological principle: Our position is “as mundane as it can be” (on large spatial scales,
the Universe is homogeneous and isotropic).
• Relativity: Laws of physics are the same everywhere.
• New definition of gravity: Gravity is the distortion of the structure of spacetime as caused by
the presence of matter and energy. The paths followed by matter and energy in spacetime are
governed by the structure of spacetime. This great feedback loop is described by Einstein’s
field equations. So, the 4D spacetime considered in GR is no longer flat.
After establishing GR as the way to describe the Universe and learning its mathematical formalism, we will finally embark on a journey of expressing mathematically the world around us on the largest scales, physically interpreting the implications and reconciling them with the observations.
Many of the phenomena for which we now have overwhelming evidence (the Big Bang, the expanding Universe, CMB radiation and black holes, among others) were first predicted by the solutions of Einstein’s equations. Therefore, it is the mathematics that holds the keys to unlocking the mysteries of the Universe, so let us begin acquiring the required mathematical skills!
Lecture 2: Basic Concepts of General Relativity
“Everything should be made as simple as possible, but not simpler.”
Albert Einstein
The Big Picture: Today we are going to introduce the notation used in GR, define the metric,
compare motion in flat and curved metrics and derive the geodesic equation — an equivalent to
Newton’s Second Law in curved spacetime.
Notation
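4-vector: $(t, x, y, z) \to (x^0, x^1, x^2, x^3)$.
Indices convention:
• Roman letters (i, j, k, l, m, n) run from 1 to 3;
• Greek letters (α, β, γ, δ, µ, ν, η, ξ) run from 0 to 3.
Einstein summation (summation over repeated indices):
\[ v'^{\alpha} = \sum_{\beta=0}^{3} \frac{\partial x'^{\alpha}}{\partial x^{\beta}} v^{\beta} \equiv \frac{\partial x'^{\alpha}}{\partial x^{\beta}} v^{\beta}. \]
Contravariant vector (index is a superscript) transforms as $A'^{\alpha} = \frac{\partial x'^{\alpha}}{\partial x^{\beta}} A^{\beta}$.
Covariant vector (index is a subscript) transforms as $A'_{\alpha} = \frac{\partial x^{\beta}}{\partial x'^{\alpha}} A_{\beta}$.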
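Tensors: objects with multiple indices.
First rank (one index):
• contravariant: $A'^{\alpha} = \frac{\partial x'^{\alpha}}{\partial x^{\beta}} A^{\beta}$.
• covariant: $A'_{\alpha} = \frac{\partial x^{\beta}}{\partial x'^{\alpha}} A_{\beta}$.
Second rank (two indices):
• contravariant: $A'^{\alpha\beta} = \frac{\partial x'^{\alpha}}{\partial x^{\xi}} \frac{\partial x'^{\beta}}{\partial x^{\nu}} A^{\xi\nu}$.
• covariant: $A'_{\alpha\beta} = \frac{\partial x^{\xi}}{\partial x'^{\alpha}} \frac{\partial x^{\nu}}{\partial x'^{\beta}} A_{\xi\nu}$.
• mixed: $A'^{\alpha}{}_{\beta} = \frac{\partial x'^{\alpha}}{\partial x^{\xi}} \frac{\partial x^{\nu}}{\partial x'^{\beta}} A^{\xi}{}_{\nu}$.
$N$th rank ($N$ indices):
• mixed:
\[ A'^{\alpha_1 \ldots \alpha_s}{}_{\alpha_{s+1} \ldots \alpha_N} = \frac{\partial x'^{\alpha_1}}{\partial x^{\beta_1}} \cdots \frac{\partial x'^{\alpha_s}}{\partial x^{\beta_s}} \frac{\partial x^{\beta_{s+1}}}{\partial x'^{\alpha_{s+1}}} \cdots \frac{\partial x^{\beta_N}}{\partial x'^{\alpha_N}} A^{\beta_1 \ldots \beta_s}{}_{\beta_{s+1} \ldots \beta_N}. \]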
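Operations with tensors:
• Addition: $A^{\alpha\beta}{}_{\xi\nu} + B^{\alpha\beta}{}_{\xi\nu} = C^{\alpha\beta}{}_{\xi\nu}$.
• Subtraction: $A^{\alpha\beta}{}_{\xi\nu} - B^{\alpha\beta}{}_{\xi\nu} = D^{\alpha\beta}{}_{\xi\nu}$.
• Tensor product: $A^{\alpha\beta}{}_{\xi\nu} B^{\gamma\delta}{}_{\eta\psi} = G^{\alpha\beta\gamma\delta}{}_{\xi\nu\eta\psi}$.
• Contraction: $A^{\alpha\beta}{}_{\beta\gamma} = H^{\alpha}{}_{\gamma}$ (summed over β).
• Inner product: $A^{\alpha\beta}{}_{\xi\nu} B^{\nu\gamma}{}_{\delta\eta} = P^{\alpha\beta\nu\gamma}{}_{\xi\nu\delta\eta} = K^{\alpha\beta\gamma}{}_{\xi\delta\eta}$ (summed over ν).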
Importance: When written in tensor form, the equations of motion keep the same form under an appropriately defined transformation:
• Newtonian mechanics: equations written in terms of 3-vectors $(x^1, x^2, x^3)$ are invariant under Galilean transformations.
• SR: equations written in terms of 4-vectors $(x^0, x^1, x^2, x^3)$ are invariant under Lorentz transformations.
• GR: equations written in terms of 4-vectors $(x^0, x^1, x^2, x^3)$ are invariant under general coordinate transformations.
Invariants: scalars which are the same in all coordinate systems.
Constants: we adopt the convention $c = k_B = G = \hbar = 1$ (to remain consistent with the book, and also because many textbooks and papers employ these units).
Metric Tensors
Flat Euclidean space. Our common sense has taught us to think in terms of a flat (Euclidean) space metric, where parallel lines never cross and the angles in a triangle always sum to 180°, thus strongly reinforcing our Newtonian (incorrect!) notion of absolute space. In this formulation, the invariant line element in Cartesian coordinates $(x^1, x^2, x^3)$ is
\[ ds^2 = (dx^1)^2 + (dx^2)^2 + (dx^3)^2, \tag{1} \]
and space is assumed to be flat. Another way to write this is
\[ ds^2 = \delta_{ij}\, dx^i dx^j, \tag{2} \]
where $\delta_{ij}$ is the Kronecker delta ($\delta_{ij} = 1$ if $i = j$, $\delta_{ij} = 0$ otherwise). Therefore, the Euclidean flat space metric tensor for Cartesian coordinates is given by
\[ \delta_{ij} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{3} \]
The invariant line element in an arbitrary coordinate system in flat space can be written in terms of Cartesian coordinates (change of variables) as
\[ ds^2 = \delta_{ij}\, dx^i dx^j = \delta_{ij} \frac{\partial x^i}{\partial x'^k} \frac{\partial x^j}{\partial x'^l}\, dx'^k dx'^l \equiv p_{kl}\, dx'^k dx'^l, \tag{4} \]
where $p_{kl}$ is the space metric of the new coordinate system.
Since the indices of the metric tensor enter eq. (4) in an identical fashion, the metric tensor is always symmetric. Furthermore, isotropy and homogeneity (as assumed in flat Euclidean space) imply that the metric tensor in such a space will necessarily be diagonal.
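As a quick numerical illustration of eq. (4), the following minimal sketch builds the metric of plane polar coordinates from the Jacobian of the change of variables x = r cos θ, y = r sin θ. The two-dimensional setting and the function names `jacobian_polar` and `induced_metric` are assumptions made for this example only; they are not part of the lecture.

```python
import math

def jacobian_polar(r, theta):
    """Jacobian dx^i/dx'^k for x = r cos(theta), y = r sin(theta).

    Rows are the Cartesian index i, columns the polar index k = (r, theta).
    """
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta),  r * math.cos(theta)]]

def induced_metric(jac):
    """p_kl = delta_ij (dx^i/dx'^k) (dx^j/dx'^l), as in eq. (4)."""
    n = len(jac)
    return [[sum(jac[i][k] * jac[i][l] for i in range(n)) for l in range(n)]
            for k in range(n)]

# Sample point chosen arbitrarily for illustration.
p = induced_metric(jacobian_polar(2.0, 0.7))
print(p)
```

Up to rounding, the result is the familiar polar line element ds² = dr² + r² dθ², i.e. a symmetric and diagonal metric with entries 1 and r².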
Flat Minkowski spacetime. We can now generalize this to 4-vectors in flat spacetime $(x^0, x^1, x^2, x^3)$:
\[ ds^2 = \eta_{\alpha\beta}\, dx^{\alpha} dx^{\beta}, \tag{5} \]
where $\eta_{\alpha\beta}$ is the Minkowski (flat) spacetime metric tensor
\[ \eta_{\alpha\beta} = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \tag{6} \]
Again, isotropy and homogeneity of spacetime lead to a diagonal metric tensor.
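The statement that scalars built with the Minkowski metric are the same in all inertial frames can be checked numerically. The sketch below applies a Lorentz boost along x (the standard boost matrix, in units with c = 1) to a sample 4-vector and compares $\eta_{\alpha\beta} x^{\alpha} x^{\beta}$ before and after. The sample vector, the boost speed and all function names are illustrative assumptions.

```python
import math

# Minkowski metric eta_{alpha beta}, eq. (6).
ETA = [[-1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]]

def boost_x(v):
    """Lorentz boost along x with speed v (units with c = 1, |v| < 1)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return [[gamma, -gamma * v, 0, 0],
            [-gamma * v, gamma, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]

def transform(matrix, x):
    """x'^alpha = L^alpha_beta x^beta (Einstein summation over beta)."""
    return [sum(matrix[a][b] * x[b] for b in range(4)) for a in range(4)]

def interval(x):
    """Invariant eta_{alpha beta} x^alpha x^beta."""
    return sum(ETA[a][b] * x[a] * x[b] for a in range(4) for b in range(4))

x = [2.0, 1.0, 0.5, -0.3]           # arbitrary sample 4-vector (t, x, y, z)
x_prime = transform(boost_x(0.6), x)
print(interval(x), interval(x_prime))
```

The components of the 4-vector change under the boost, while the two printed intervals agree to rounding error.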
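Curved spacetime. For a general (possibly curved) covariant spacetime metric tensor $g_{\alpha\beta}$, the invariant line element is given by
\[ ds^2 = g_{\alpha\beta}\, dx^{\alpha} dx^{\beta}. \tag{7} \]
The contravariant spacetime metric tensor $g^{\alpha\beta}$ is the inverse of the covariant tensor $g_{\alpha\beta}$:
\[ g^{\alpha\beta} g_{\beta\nu} = \delta^{\alpha}_{\nu}. \tag{8} \]
This implies that whenever the metric tensor is diagonal, each diagonal component satisfies $g^{\alpha\alpha} = (g_{\alpha\alpha})^{-1}$ (no summation).
One can take inner products of tensors with the metric tensor, thus lowering or raising indices:
\[ A_{\alpha\beta} = g_{\alpha\nu} A^{\nu}{}_{\beta}, \qquad A^{\alpha\beta} = g^{\alpha\nu} A_{\nu}{}^{\beta}. \tag{9} \]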
Expanding flat spacetime (Friedmann-Lemaître-Robertson-Walker metric tensor). The metric tensor for a flat, homogeneous and isotropic spacetime which is expanding in its spatial coordinates by a scale factor a(t) is obtained from the Minkowski metric by scaling its spatial components by $a^2(t)$:
\[ g_{\alpha\beta} = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & a^2(t) & 0 & 0 \\ 0 & 0 & a^2(t) & 0 \\ 0 & 0 & 0 & a^2(t) \end{pmatrix}. \tag{10} \]
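Eqs. (8) and (9) are easy to exercise on the diagonal metric of eq. (10). The following is a minimal sketch, assuming a fixed sample value of the scale factor and an arbitrary covariant vector; the helper names are hypothetical.

```python
def flrw_metric(a):
    """Covariant flat FLRW metric g_{alpha beta} = diag(-1, a^2, a^2, a^2), eq. (10)."""
    diag = [-1.0, a * a, a * a, a * a]
    return [[diag[i] if i == j else 0.0 for j in range(4)] for i in range(4)]

def inverse_diagonal(g):
    """Contravariant metric of a diagonal metric: reciprocal of each diagonal entry."""
    n = len(g)
    return [[1.0 / g[i][i] if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Matrix product, i.e. the contraction g^{alpha beta} g_{beta nu}."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def raise_index(g_inv, A_lower):
    """A^alpha = g^{alpha nu} A_nu."""
    n = len(A_lower)
    return [sum(g_inv[a][nu] * A_lower[nu] for nu in range(n)) for a in range(n)]

g = flrw_metric(2.0)                     # sample scale factor a = 2
g_inv = inverse_diagonal(g)
identity = matmul(g_inv, g)              # should reproduce delta^alpha_nu, eq. (8)
A_upper = raise_index(g_inv, [1.0, 4.0, 0.0, 8.0])
print(identity)
print(A_upper)
```

Raising the index flips the sign of the time component and divides the spatial components by a², which is all that eq. (9) does for a diagonal metric.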
Covariant Derivative
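Consider a vector $\vec{A}$ given in terms of its components along the basis vectors:
\[ \vec{A} = A^{\alpha} \hat{e}_{\alpha}. \tag{11} \]
Differentiating the vector $\vec{A}$ using the Leibniz rule $(fg)' = f'g + g'f$, we obtain
\[ \frac{\partial \vec{A}}{\partial x^{\alpha}} = \frac{\partial}{\partial x^{\alpha}} \left( A^{\beta} \hat{e}_{\beta} \right) = \frac{\partial A^{\beta}}{\partial x^{\alpha}} \hat{e}_{\beta} + A^{\beta} \frac{\partial \hat{e}_{\beta}}{\partial x^{\alpha}}. \tag{12} \]
In flat Cartesian coordinates, the basis vectors are constant, so the last term in the equation above vanishes. However, this is not the case in general curved spaces. In general, the derivative in the last term will not vanish, and it will itself be given in terms of the original basis vectors:
\[ \frac{\partial \hat{e}_{\beta}}{\partial x^{\alpha}} = \Gamma^{\nu}_{\alpha\beta}\, \hat{e}_{\nu}. \tag{13} \]
$\Gamma^{\nu}_{\alpha\beta}$ is called the Christoffel symbol (or affine connection). It is given in terms of the metric:
\[ \Gamma^{\nu}_{\alpha\beta} \equiv \frac{1}{2} g^{\nu\gamma} \left( g_{\alpha\gamma,\beta} + g_{\gamma\beta,\alpha} - g_{\alpha\beta,\gamma} \right). \tag{14} \]
Taking the curvature of the ambient manifold into account when taking derivatives of vectors or tensors yields the covariant derivative:
\[ A^{\alpha}{}_{;\beta} \equiv A^{\alpha}{}_{,\beta} + \Gamma^{\alpha}_{\nu\beta} A^{\nu}, \tag{15} \]
\[ A_{\alpha;\beta} \equiv A_{\alpha,\beta} - \Gamma^{\nu}_{\alpha\beta} A_{\nu}, \tag{16} \]
where $A^{\alpha}{}_{,\beta} \equiv \frac{\partial A^{\alpha}}{\partial x^{\beta}}$ and $A_{\alpha,\beta} \equiv \frac{\partial A_{\alpha}}{\partial x^{\beta}}$.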
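The definition of the Christoffel symbol in eq. (14) can be evaluated numerically for any metric given as a function of the coordinates. The sketch below does this with central finite differences for the flat plane in polar coordinates, ds² = dr² + r² dθ², for which the standard non-zero symbols are $\Gamma^{r}_{\theta\theta} = -r$ and $\Gamma^{\theta}_{r\theta} = \Gamma^{\theta}_{\theta r} = 1/r$. The choice of metric, the step size and the function names are assumptions made for illustration.

```python
def metric_polar(x):
    """Metric of the flat plane in polar coordinates x = (r, theta): diag(1, r^2)."""
    r, _theta = x
    return [[1.0, 0.0], [0.0, r * r]]

def inverse_2x2(g):
    """Inverse of a 2x2 matrix, giving the contravariant metric g^{ab}."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

def metric_derivative(metric, x, k, h=1e-5):
    """Central-difference estimate of g_{ab,k} = d g_{ab} / d x^k."""
    x_plus, x_minus = list(x), list(x)
    x_plus[k] += h
    x_minus[k] -= h
    g_plus, g_minus = metric(x_plus), metric(x_minus)
    n = len(x)
    return [[(g_plus[a][b] - g_minus[a][b]) / (2 * h) for b in range(n)]
            for a in range(n)]

def christoffel(metric, x):
    """gamma[nu][a][b] = 1/2 g^{nu c} (g_{ac,b} + g_{cb,a} - g_{ab,c}), eq. (14)."""
    n = len(x)
    g_inv = inverse_2x2(metric(x))
    dg = [metric_derivative(metric, x, k) for k in range(n)]  # dg[k][a][b] = g_{ab,k}
    return [[[0.5 * sum(g_inv[nu][c] * (dg[b][a][c] + dg[a][c][b] - dg[c][a][b])
                        for c in range(n))
              for b in range(n)] for a in range(n)] for nu in range(n)]

gamma = christoffel(metric_polar, [2.0, 0.3])   # sample point r = 2, theta = 0.3
print(gamma[0][1][1], gamma[1][0][1])
```

Note that the symbols are non-zero even though the space is flat: they reflect the changing polar basis vectors of eq. (13), not curvature.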
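For vectors $A^{\alpha}$ and $A_{\alpha}$ defined along a curve $x^{\beta} = x^{\beta}(s)$, the covariant derivatives along this curve are
\[ \frac{DA^{\alpha}}{Ds} \equiv \frac{dA^{\alpha}}{ds} + \Gamma^{\alpha}_{\beta\gamma} \frac{dx^{\gamma}}{ds} A^{\beta}, \qquad \frac{DA_{\alpha}}{Ds} \equiv \frac{dA_{\alpha}}{ds} - \Gamma^{\beta}_{\alpha\gamma} \frac{dx^{\gamma}}{ds} A_{\beta}. \tag{17} \]
The covariant derivative is the curved-spacetime analog of the ordinary derivative in Cartesian coordinates in flat spacetime.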
Principle of General Covariance states that all tensor equations valid in SR will also be valid
in GR if:
• the Minkowski metric ηαβ is replaced by a general curved metric gαβ ;
• all partial derivatives are replaced by covariant derivatives (, →;).
Examples:
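\[ d\tau^2 = -\eta_{\alpha\beta}\, dx^{\alpha} dx^{\beta} \implies d\tau^2 = -g_{\alpha\beta}\, dx^{\alpha} dx^{\beta}, \]
\[ \eta_{\alpha\beta} u^{\alpha} u^{\beta} = -1 \implies g_{\alpha\beta} u^{\alpha} u^{\beta} = -1, \]
\[ T^{\alpha\beta}{}_{,\beta} = 0 \implies T^{\alpha\beta}{}_{;\beta} = 0. \]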
Geodesic Equation
In Newtonian mechanics, the Second Law states that a force imparts acceleration to the body it acts on:
\[ m \frac{d^2 \vec{x}}{dt^2} = \vec{F} = -\vec{\nabla}\Phi \implies \frac{d^2 \vec{x}}{dt^2} = -\frac{1}{m} \vec{\nabla}\Phi. \tag{18} \]
In the absence of forces acting on a body, the Second Law reduces to the First Law:
\[ \frac{d^2 \vec{x}}{dt^2} = 0. \tag{19} \]
In flat Euclidean space and flat Minkowski spacetime, this leads to straight lines.
It is a fundamental assumption of GR that, in curved spacetimes, free particles (i.e., particles
feeling no non-gravitational effects) follow paths that extremize their proper interval ds. Such paths
are called geodesics. Therefore, generalizing Newton’s laws on motion of a particle in the absence
of forces (eq. (19)) to a general curved spacetime metric leads to the geodesic equation.
Important note: Here we derive the geodesic equation using the variational principle (Lagrange’s
equations). This is an alternative to the approach presented in the textbook. Both approaches are
presented to provide a more thorough understanding — therefore they should both be studied and
understood.
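Suppose the points $x^{\alpha}$ lie on a curve parametrized by the parameter λ, i.e.,
\[ x^{\alpha} \equiv x^{\alpha}(\lambda), \qquad dx^{\alpha} = \frac{dx^{\alpha}}{d\lambda}\, d\lambda, \tag{20} \]
and the distance between two points A and B is given by
\[ s_{AB} = \int_A^B ds = \int_A^B \frac{ds}{d\lambda}\, d\lambda = \int_A^B \sqrt{ g_{\alpha\beta} \frac{dx^{\alpha}}{d\lambda} \frac{dx^{\beta}}{d\lambda} }\; d\lambda. \tag{21} \]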
The shortest path between the points A and B is called the geodesic, and it is found by extremizing (minimizing) the path length $s_{AB}$. This is done with the standard tools of variational calculus, which lead to Lagrange’s equations; we derive them here as a reminder.
Extremizing the functional using a variational principle (Lagrange’s equations).
Consider
\[ G \equiv \int_A^B L\!\left( \lambda, x, \frac{dx}{d\lambda} \right) d\lambda. \tag{22} \]
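Let x = X(λ) be the curve extremizing G. Then a nearby curve passing through A and B can be parametrized as x = X(λ) + εη(λ), such that η(A) = η(B) = 0. Extremizing eq. (22), with $\dot{x} \equiv \frac{dx}{d\lambda}$ and $\dot{\eta} \equiv \frac{d\eta}{d\lambda}$, we have:
\[ \begin{aligned}
\left. \frac{dG}{d\varepsilon} \right|_{\varepsilon=0} &= \int_A^B \left[ \frac{\partial L}{\partial x} \eta + \frac{\partial L}{\partial \dot{x}} \dot{\eta} \right] d\lambda \\
&= \int_A^B \frac{\partial L}{\partial x} \eta\, d\lambda + \int_A^B \frac{\partial L}{\partial \dot{x}} \dot{\eta}\, d\lambda \qquad \text{(now integrate by parts)} \\
&= \int_A^B \frac{\partial L}{\partial x} \eta\, d\lambda + \left. \frac{\partial L}{\partial \dot{x}} \eta \right|_A^B - \int_A^B \frac{d}{d\lambda} \left( \frac{\partial L}{\partial \dot{x}} \right) \eta\, d\lambda \\
&= \int_A^B \left[ \frac{\partial L}{\partial x} - \frac{d}{d\lambda} \frac{\partial L}{\partial \dot{x}} \right] \eta\, d\lambda = 0 \qquad \text{(recall } \eta(A) = \eta(B) = 0\text{)}.
\end{aligned} \tag{23} \]
But the function η is arbitrary, so in order to have $\left. \frac{dG}{d\varepsilon} \right|_{\varepsilon=0} = 0$, the bracket in the integrand must vanish, and so we arrive at Lagrange’s equations:
\[ \frac{\partial L}{\partial x} - \frac{d}{d\lambda} \frac{\partial L}{\partial \dot{x}} = 0, \tag{24} \]
which can be extended to any number of coordinates:
\[ \frac{\partial L}{\partial x^{\alpha}} - \frac{d}{d\lambda} \frac{\partial L}{\partial \dot{x}^{\alpha}} = 0. \tag{25} \]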
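After this little side-derivation, let us march on toward the geodesic equation. We can now apply Lagrange’s equations to eq. (21), using
\[ L = \frac{1}{2} g_{\gamma\delta}\, \dot{x}^{\gamma} \dot{x}^{\delta}. \tag{26} \]
(Alternatively, one can use a more traditional form for the Lagrangian, $L = \sqrt{g_{\gamma\delta}\, \dot{x}^{\gamma} \dot{x}^{\delta}}$, but the mathematics is a lot cleaner with this choice.)
After substituting eq. (26) into eq. (25) we have
\[ \frac{1}{2} g_{\gamma\delta,\alpha}\, \dot{x}^{\gamma} \dot{x}^{\delta} - \frac{d}{d\lambda} \left[ g_{\gamma\alpha} \dot{x}^{\gamma} \right] = 0, \tag{27} \]
where $g_{\gamma\delta,\alpha} \equiv \frac{\partial g_{\gamma\delta}}{\partial x^{\alpha}}$.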
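After recognizing that
\[ \frac{d}{d\lambda} g_{\gamma\alpha} = \frac{\partial g_{\gamma\alpha}}{\partial x^{\delta}} \dot{x}^{\delta}, \]
we obtain
\[ \frac{1}{2} g_{\gamma\delta,\alpha}\, \dot{x}^{\gamma} \dot{x}^{\delta} - g_{\gamma\alpha,\delta}\, \dot{x}^{\delta} \dot{x}^{\gamma} - g_{\gamma\alpha} \ddot{x}^{\gamma} = \left( \frac{1}{2} g_{\gamma\delta,\alpha} - g_{\gamma\alpha,\delta} \right) \dot{x}^{\gamma} \dot{x}^{\delta} - g_{\gamma\alpha} \ddot{x}^{\gamma} = 0. \tag{28} \]
Multiplying by $g^{\nu\alpha}$, the equation simplifies to
\[ g^{\nu\alpha} \left( \frac{1}{2} g_{\gamma\delta,\alpha} - g_{\gamma\alpha,\delta} \right) \dot{x}^{\gamma} \dot{x}^{\delta} - \ddot{x}^{\nu} = 0. \tag{29} \]
Recasting it in a form resembling Newton’s laws, eq. (29) becomes
\[ \ddot{x}^{\nu} = -g^{\nu\alpha} \left( g_{\gamma\alpha,\delta} - \frac{1}{2} g_{\gamma\delta,\alpha} \right) \dot{x}^{\gamma} \dot{x}^{\delta}, \tag{30} \]
or in terms of the Christoffel symbol $\Gamma^{\nu}_{\gamma\delta}$:
\[ \ddot{x}^{\nu} = -\Gamma^{\nu}_{\gamma\delta}\, \dot{x}^{\gamma} \dot{x}^{\delta}. \tag{31} \]
(Note that in going from eq. (30) to eq. (31) with the definition in eq. (14), we have used that $g_{\gamma\alpha,\delta}\, \dot{x}^{\gamma} \dot{x}^{\delta} = g_{\alpha\delta,\gamma}\, \dot{x}^{\gamma} \dot{x}^{\delta}$.)
In Euclidean space and Minkowski spacetime, $g_{\alpha\beta}$ is diagonal and constant, so its derivatives, and consequently the Christoffel symbols, vanish, thus leaving us with straight lines, as it should.
Another advantage of using the Lagrangian in the form given in eq. (26) is that solving the Lagrange equation (25) in each coordinate yields a differential equation of the same form as the geodesic equation (31). The Christoffel symbols can then simply be read off.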
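To see the geodesic equation (31) at work, the sketch below integrates it numerically for the flat plane in polar coordinates, where the standard non-zero Christoffel symbols are $\Gamma^{r}_{\theta\theta} = -r$ and $\Gamma^{\theta}_{r\theta} = \Gamma^{\theta}_{\theta r} = 1/r$. Eq. (31) then reads $\ddot{r} = r\dot{\theta}^2$ and $\ddot{\theta} = -2\dot{r}\dot{\theta}/r$. The fourth-order Runge-Kutta integrator, the initial conditions and the function names are assumptions chosen for this illustration.

```python
import math

def geodesic_rhs(state):
    """Right-hand side of eq. (31) for the metric ds^2 = dr^2 + r^2 dtheta^2.

    state = (r, theta, r_dot, theta_dot); dots are derivatives with respect to lambda.
    """
    r, _theta, r_dot, theta_dot = state
    return [r_dot,
            theta_dot,
            r * theta_dot ** 2,                 # -Gamma^r_{theta theta} theta_dot^2
            -2.0 * r_dot * theta_dot / r]       # -2 Gamma^theta_{r theta} r_dot theta_dot

def rk4_step(state, h):
    """One classical Runge-Kutta step of size h."""
    def shifted(base, slope, factor):
        return [b + factor * s for b, s in zip(base, slope)]
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, h / 2))
    k3 = geodesic_rhs(shifted(state, k2, h / 2))
    k4 = geodesic_rhs(shifted(state, k3, h))
    return [s + h / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def integrate(state, lam_end, steps):
    """Integrate the geodesic from lambda = 0 to lambda = lam_end."""
    h = lam_end / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state

# Start at (x, y) = (1, 0) moving in the +y direction with unit speed.
r, theta, _, _ = integrate([1.0, 0.0, 0.0, 1.0], 1.0, 1000)
x, y = r * math.cos(theta), r * math.sin(theta)
print(x, y)
```

Converted back to Cartesian coordinates, the endpoint lies on the straight line x = 1, as expected for a free particle in flat space, even though neither r nor θ evolves linearly in λ.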
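Recovering Newtonian gravity. Let us verify that in the limit of slow motion ($v \ll c$) and weak, stationary gravitational fields, the geodesic equation yields Newton’s Second Law.
In the limit of slow motion, the RHS of eq. (31) reduces to the single term containing $\Gamma^{\nu}_{00} (\dot{x}^0)^2$. But
\[ \Gamma^{\nu}_{00} = \frac{1}{2} g^{\nu\alpha} \left( g_{0\alpha,0} + g_{\alpha 0,0} - g_{00,\alpha} \right) = -\frac{1}{2} g^{\nu\alpha} g_{00,\alpha} = -\frac{1}{2} g^{\nu i} g_{00,i}, \tag{32} \]
because the stationary field approximation renders all $g_{\alpha\beta,0} = 0$. Using perturbation theory, recast the metric as a small deviation from the flat Minkowski spacetime:
\[ g_{\alpha\beta} = \eta_{\alpha\beta} + \epsilon_{\alpha\beta}, \qquad g^{\alpha\beta} = \eta^{\alpha\beta} - \epsilon^{\alpha\beta}, \tag{33} \]
where $\epsilon_{\alpha\beta}$ is a small perturbation. Then, to first order in $\epsilon_{\alpha\beta}$:
\[ \Gamma^{\nu}_{00} = -\frac{1}{2} \left( \eta^{\nu i} - \epsilon^{\nu i} \right) \epsilon_{00,i} = -\frac{1}{2} \eta^{\nu i} \epsilon_{00,i} + O(\epsilon^2). \tag{34} \]
Then $\Gamma^{0}_{00} = 0$ and $\Gamma^{j}_{00} = -\frac{1}{2} \eta^{ji} \epsilon_{00,i}$. For ν = 0, $\ddot{x}^0 = \frac{d^2 t}{d\lambda^2} = 0 \implies \frac{dt}{d\lambda} = \text{const.}$, and for ν = j
\[ \ddot{x}^{j} = \frac{d^2 x^j}{d\lambda^2} = \frac{1}{2} \eta^{ji} \epsilon_{00,i} (\dot{x}^0)^2 = \frac{1}{2} \eta^{ji} \epsilon_{00,i} \left( \frac{dt}{d\lambda} \right)^2. \tag{35} \]
But
\[ \frac{dx^j}{d\lambda} = \frac{dt}{d\lambda} \frac{dx^j}{dt} \implies \ddot{x}^{j} = \frac{d^2 x^j}{d\lambda^2} = \left( \frac{dt}{d\lambda} \right)^2 \frac{d^2 x^j}{dt^2} \implies \frac{d^2 x^j}{dt^2} = \frac{1}{2} \eta^{ji} \epsilon_{00,i}. \tag{36} \]
Recalling that $x^j = \left( \frac{x}{c}, \frac{y}{c}, \frac{z}{c} \right)$, and casting it in vector form, we arrive at
\[ \frac{d^2 \vec{x}}{dt^2} = \frac{1}{2} c^2\, \vec{\nabla} \epsilon_{00}. \tag{37} \]
When we compare this to Newton’s Second Law
\[ \frac{d^2 \vec{x}}{dt^2} = -\vec{\nabla}\Phi, \tag{38} \]
we find that $\epsilon_{00} = -\frac{2\Phi}{c^2}$ and
\[ g_{00} = -\left( 1 + \frac{2\Phi}{c^2} \right). \tag{39} \]
In spherical symmetry $\Phi = -\frac{GM}{r}$, so $g_{00} = -\left( 1 - \frac{2GM}{rc^2} \right)$. This quantifies how mass curves the spacetime in the Newtonian approximation.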
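To get a feeling for how small the correction to $g_{00}$ is in everyday situations, the sketch below evaluates $2GM/(rc^2)$ at the surfaces of the Earth and the Sun. The numerical constants are approximate reference values inserted for illustration and are not taken from the lecture; the function names are hypothetical.

```python
# Approximate SI values, assumed here for illustration only.
G_NEWTON = 6.674e-11      # m^3 kg^-1 s^-2
C_LIGHT = 2.998e8         # m s^-1

def metric_perturbation(mass_kg, radius_m):
    """Magnitude of the weak-field correction 2GM/(r c^2) to g_00."""
    return 2.0 * G_NEWTON * mass_kg / (radius_m * C_LIGHT ** 2)

def g00(mass_kg, radius_m):
    """g_00 = -(1 - 2GM/(r c^2)) in the Newtonian approximation, eq. (39)."""
    return -(1.0 - metric_perturbation(mass_kg, radius_m))

earth = metric_perturbation(5.972e24, 6.371e6)   # Earth mass and radius
sun = metric_perturbation(1.989e30, 6.957e8)     # Sun mass and radius
print(earth, sun)
```

Both numbers are far below unity (of order a part in a billion for the Earth and a few parts in a million for the Sun), which is why the weak-field expansion in eq. (33) is an excellent approximation in the Solar System.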
Lecture 3: Einstein’s Field Equations
“God used beautiful mathematics in creating the world.”
Paul Dirac
The Big Picture: Last time we derived the geodesic equation (a GR equivalent of Newton’s
Second Law), which describes how a particle moves in a curved spacetime. Today we are going
to derive the second part necessary to complete the dynamical description: how the presence of
matter and energy curves the ambient spacetime. This is given by Einstein’s field equation, which
is nothing else but the GR analog of the Poisson equation.
Riemann Tensor, Ricci Tensor, Ricci Scalar, Einstein Tensor
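Riemann (curvature) tensor plays an important role in specifying the geometrical properties of spacetime. It is defined in terms of Christoffel symbols:
\[ R^{\alpha}{}_{\beta\gamma\delta} \equiv \Gamma^{\alpha}_{\beta\delta,\gamma} - \Gamma^{\alpha}_{\beta\gamma,\delta} + \Gamma^{\nu}_{\beta\delta} \Gamma^{\alpha}_{\nu\gamma} - \Gamma^{\nu}_{\beta\gamma} \Gamma^{\alpha}_{\nu\delta}, \tag{40} \]
where $\Gamma^{\alpha}_{\beta\delta,\gamma} \equiv \frac{\partial}{\partial x^{\gamma}} \Gamma^{\alpha}_{\beta\delta}$. The spacetime is considered flat if the Riemann tensor vanishes everywhere.
The Riemann tensor can also be written directly in terms of the spacetime metric
\[ R_{\alpha\beta\gamma\delta} \equiv \frac{1}{2} \left( g_{\beta\gamma,\alpha\delta} + g_{\alpha\delta,\beta\gamma} - g_{\beta\delta,\alpha\gamma} - g_{\alpha\gamma,\beta\delta} \right) + g_{\mu\nu} \Gamma^{\nu}_{\alpha\delta} \Gamma^{\mu}_{\beta\gamma} - g_{\mu\nu} \Gamma^{\nu}_{\alpha\gamma} \Gamma^{\mu}_{\beta\delta}, \tag{41} \]
thus revealing symmetries of the Riemann tensor:
\[ R_{\alpha\beta\gamma\delta} = -R_{\beta\alpha\gamma\delta} = -R_{\alpha\beta\delta\gamma} = R_{\gamma\delta\alpha\beta}, \tag{42} \]
\[ R_{\alpha\beta\gamma\delta} + R_{\alpha\gamma\delta\beta} + R_{\alpha\delta\beta\gamma} = 0. \tag{43} \]
Because of the symmetries above, the Riemann tensor in 4-dimensional spacetime has only 20 independent components. The general rule for computing the number of independent components in an $N$-dimensional spacetime is $N^2 (N^2 - 1)/12$.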
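The counting rule quoted above is simple enough to tabulate. A minimal sketch (the function name is hypothetical):

```python
def riemann_independent_components(n):
    """Number of independent Riemann tensor components in n dimensions: n^2 (n^2 - 1) / 12."""
    return n * n * (n * n - 1) // 12

for dim in (1, 2, 3, 4):
    print(dim, riemann_independent_components(dim))
```

In two dimensions a single component remains (equivalent to the Gaussian curvature of a surface), while four dimensions give the 20 components mentioned in the text.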
Ricci tensor is obtained from the Riemann tensor by simply contracting over two of the indices:
\[ R_{\alpha\beta} \equiv R^{\gamma}{}_{\alpha\gamma\beta}. \tag{44} \]
It is symmetric, which means that it has at most 10 independent quantities.
Ricci scalar is obtained by contracting the Ricci tensor over the remaining two indices:
\[ R \equiv g^{\alpha\beta} R_{\alpha\beta} = R^{\alpha}{}_{\alpha}. \tag{45} \]
Bianchi identities are another important symmetry of the Riemann tensor
\[ R_{\alpha\beta\gamma\delta;\nu} + R_{\alpha\beta\nu\gamma;\delta} + R_{\alpha\beta\delta\nu;\gamma} = 0, \tag{46} \]
which, after contracting, leads to
\[ R^{\alpha\beta}{}_{;\alpha} = \frac{1}{2} g^{\alpha\beta} R_{;\alpha}, \tag{47} \]
which we will use shortly.
Einstein tensor is defined in terms of the Ricci tensor and Ricci scalar as
\[ G^{\alpha\beta} \equiv R^{\alpha\beta} - \frac{1}{2} g^{\alpha\beta} R. \tag{48} \]
From eq. (47), a very important property of the Einstein tensor is derived:
\[ G^{\alpha\beta}{}_{;\alpha} = 0. \tag{49} \]
Energy-Momentum Tensor
Energy-momentum (stress-energy) tensor $T^{\alpha\beta}$ describes the density and flows of the 4-momentum $(-E, p_1, p_2, p_3)$. The component $T^{\alpha\beta}$ is the flux or flow of the α component of the 4-momentum crossing the surface of constant $x^{\beta}$:
• $T^{00}$ represents energy density;
• $T^{0i}$ represents the flow (flux) of energy in the $x^i$ direction;
• $T^{i0}$ represents the density of the i-component of momentum;
• $T^{ij}$ represents the flow of the i-component of momentum in the j-direction (stress).
Figure 1: Components of the energy-momentum tensor $T^{\alpha\beta}$.
The energy-momentum tensor is symmetric, $T^{\alpha\beta} = T^{\beta\alpha}$. We now consider two types of energy-momentum tensor frequently used in GR: dust and perfect fluid.
Dust is the simplest possible energy-momentum tensor. It is given by
\[ T^{\alpha\beta} = \rho\, u^{\alpha} u^{\beta}. \tag{50} \]
For a comoving observer, the 4-velocity is given by $\vec{u} = (1, 0, 0, 0)$, so the stress-energy tensor reduces to
\[ T^{\alpha\beta} = \begin{pmatrix} \rho & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}. \tag{51} \]
Dust is an approximation of the Universe at later times, when radiation is negligible.
Perfect fluid is a fluid that has no heat conduction or viscosity. It is fully parametrized by its mass density ρ and the pressure P. It is given by
\[ T^{\alpha\beta} = (\rho + P)\, u^{\alpha} u^{\beta} + P g^{\alpha\beta}. \tag{52} \]
For a comoving observer, the 4-velocity is given by $\vec{u} = (1, 0, 0, 0)$, so the stress-energy tensor reduces to
\[ T^{\alpha\beta} = \begin{pmatrix} \rho & 0 & 0 & 0 \\ 0 & P & 0 & 0 \\ 0 & 0 & P & 0 \\ 0 & 0 & 0 & P \end{pmatrix}. \tag{53} \]
In the limit of P → 0, the perfect fluid approximation reduces to that of dust. Perfect fluid is an approximation of the Universe at earlier times, when radiation dominates.
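The reduction from eq. (52) to eq. (53) can be reproduced in a few lines. The sketch below assumes a locally flat (Minkowski) metric, so that $g^{\alpha\beta} = \mathrm{diag}(-1, 1, 1, 1)$, and a comoving observer; the sample values of ρ and P and the function name are illustrative assumptions.

```python
# Contravariant Minkowski metric eta^{alpha beta}, assumed here for a local comoving frame.
ETA_INV = [[-1.0, 0.0, 0.0, 0.0],
           [0.0, 1.0, 0.0, 0.0],
           [0.0, 0.0, 1.0, 0.0],
           [0.0, 0.0, 0.0, 1.0]]

def perfect_fluid_tensor(rho, pressure, u, g_inv):
    """T^{alpha beta} = (rho + P) u^alpha u^beta + P g^{alpha beta}, eq. (52)."""
    return [[(rho + pressure) * u[a] * u[b] + pressure * g_inv[a][b]
             for b in range(4)] for a in range(4)]

u_comoving = [1.0, 0.0, 0.0, 0.0]
T_fluid = perfect_fluid_tensor(1.0, 0.3, u_comoving, ETA_INV)
T_dust = perfect_fluid_tensor(1.0, 0.0, u_comoving, ETA_INV)   # P -> 0 limit
for row in T_fluid:
    print(row)
```

The time-time entry comes out as ρ because the $P g^{00} = -P$ term cancels the pressure in $(\rho + P)$, and setting P = 0 recovers the dust tensor of eq. (51).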
Conservation equations for the energy-momentum tensor $T^{\alpha\beta}$ are simply given by
\[ T^{\alpha\beta}{}_{;\beta} = 0. \tag{54} \]
This expression incorporates both energy and momentum conservation in a general metric. In the limit of flat spacetime (Minkowski metric), it reduces to
\[ \frac{\partial T^{\alpha\beta}}{\partial x^{\beta}} = 0, \tag{55} \]
from which the traditional expressions for the conservation of momentum and energy are readily recovered.
Evolution of Energy
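Conservation of energy given in eq. (54) can be used to determine how components of the energy-momentum tensor evolve with time. Following the notation in the textbook, the mixed energy-momentum tensor is
\[ T^{\alpha}{}_{\beta} = \begin{pmatrix} -\rho & 0 & 0 & 0 \\ 0 & P & 0 & 0 \\ 0 & 0 & P & 0 \\ 0 & 0 & 0 & P \end{pmatrix}, \tag{56} \]
and its conservation is given by
\[ T^{\mu}{}_{\nu;\mu} \equiv \frac{\partial T^{\mu}{}_{\nu}}{\partial x^{\mu}} + \Gamma^{\mu}_{\alpha\mu} T^{\alpha}{}_{\nu} - \Gamma^{\alpha}_{\nu\mu} T^{\mu}{}_{\alpha} = 0, \tag{57} \]
which gives four separate equations. Consider the ν = 0 component:
\[ \frac{\partial T^{\mu}{}_{0}}{\partial x^{\mu}} + \Gamma^{\mu}_{\alpha\mu} T^{\alpha}{}_{0} - \Gamma^{\alpha}_{0\mu} T^{\mu}{}_{\alpha} = 0. \tag{58} \]
Because of isotropy, all non-diagonal terms of $T^{\alpha}{}_{\beta}$ vanish, so $T^{i}{}_{0} = 0$. This leads to µ = 0 in the first term and α = 0 in the second term above. Thus
\[ \frac{\partial T^{0}{}_{0}}{\partial x^{0}} + \Gamma^{\mu}_{0\mu} T^{0}{}_{0} - \Gamma^{\alpha}_{0\mu} T^{\mu}{}_{\alpha} = 0, \qquad -\frac{\partial \rho}{\partial t} - \Gamma^{\mu}_{0\mu}\, \rho - \Gamma^{\alpha}_{0\mu} T^{\mu}{}_{\alpha} = 0. \tag{59} \]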
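Expanding flat spacetime is described by the flat Friedmann-Lemaître-Robertson-Walker metric tensor given in eq. (10):
\[ g_{\alpha\beta} = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & a^2(t) & 0 & 0 \\ 0 & 0 & a^2(t) & 0 \\ 0 & 0 & 0 & a^2(t) \end{pmatrix}. \tag{60} \]
From the definition of the Christoffel symbol,
\[ \begin{aligned}
\Gamma^{\alpha}_{\nu\mu} &\equiv \frac{1}{2} g^{\alpha\gamma} \left( g_{\nu\gamma,\mu} + g_{\gamma\mu,\nu} - g_{\nu\mu,\gamma} \right), \\
\Gamma^{\alpha}_{0\mu} &= \frac{1}{2} g^{\alpha\gamma} \left( g_{0\gamma,\mu} + g_{\gamma\mu,0} - g_{0\mu,\gamma} \right) \\
&= \frac{1}{2} g^{\alpha\gamma} g_{\gamma\mu,0} \qquad \text{because } g_{0\gamma} = \text{const.},\ g_{0\mu} = \text{const.}, \\
&= \begin{cases} \frac{1}{2} \delta^{\alpha\gamma} a^{-2} \left( 2 \delta_{\gamma\mu}\, \dot{a} a \right) & \text{if } \alpha \neq 0 \text{ and } \mu \neq 0, \\ 0 & \text{if } \alpha = 0 \text{ or } \mu = 0, \end{cases} \qquad \text{because } g_{\gamma 0,0} = 0,\ g_{0\mu,0} = 0,
\end{aligned} \tag{61} \]
so that the only non-zero $\Gamma^{\alpha}_{0\mu}$ is $\Gamma^{i}_{0i} = \dot{a}/a$ for each fixed i (note: when summed over repeated indices, $\Gamma^{i}_{0i} = 3\dot{a}/a$).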
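So, the conservation law in the expanding Universe from eq. (59) becomes
\[ \frac{\partial \rho}{\partial t} + 3 \frac{\dot{a}}{a} \rho + \frac{\dot{a}}{a} T^{i}{}_{i} = 0 \implies \frac{\partial \rho}{\partial t} + 3 \frac{\dot{a}}{a} (\rho + P) = 0. \tag{62} \]
We can massage this to get
\[ a^{-3} \frac{\partial \left( \rho a^3 \right)}{\partial t} = -3 \frac{\dot{a}}{a} P, \tag{63} \]
and use it to find out how both matter and radiation scale with expansion. For matter (dust approximation), we have zero pressure $P_m = 0$, so
\[ \frac{\partial \left( \rho_m a^3 \right)}{\partial t} = -3 a^2 \dot{a} P_m = 0, \tag{64} \]
which means that the energy density of matter scales as $\rho_m \propto a^{-3}$. This should come as no surprise, because the total amount of matter $M_m$ is conserved, and the volume of the Universe goes as $V \propto a^3$, so $\rho_m \propto M_m / V \propto a^{-3}$.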
For radiation, $P_r = \rho_r / 3$, so from eq. (62) we obtain
\[ \frac{\partial \rho_r}{\partial t} + 4 \frac{\dot{a}}{a} \rho_r = a^{-4} \frac{\partial \left( \rho_r a^4 \right)}{\partial t} = 0, \]
which implies that $\rho_r \propto a^{-4}$. This too should not surprise us, since the radiation energy density is directly proportional to the energy per particle and inversely proportional to the total volume, i.e., $\rho_r \propto \frac{n_r \hbar \nu}{V} \propto \frac{n_r \hbar}{\lambda V} \propto a^{-4}$, because $\lambda \propto a$. The last part states that the energy per particle decreases as the Universe expands.
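Both scaling laws can be checked by integrating the conservation law, eq. (62), numerically. Writing the equation of state as P = wρ (the parameter w is introduced here for convenience and does not appear in the lecture; w = 0 for dust and w = 1/3 for radiation), eq. (62) becomes dρ/da = −3(1 + w)ρ/a. The sketch below integrates this with a fourth-order Runge-Kutta scheme; the integrator, step count and function name are assumptions made for illustration.

```python
def density_after_expansion(w, a_start=1.0, a_end=2.0, steps=1000, rho_start=1.0):
    """Integrate d(rho)/da = -3 (1 + w) rho / a from a_start to a_end with RK4."""
    def slope(a, rho):
        return -3.0 * (1.0 + w) * rho / a

    h = (a_end - a_start) / steps
    a, rho = a_start, rho_start
    for _ in range(steps):
        k1 = slope(a, rho)
        k2 = slope(a + h / 2, rho + h * k1 / 2)
        k3 = slope(a + h / 2, rho + h * k2 / 2)
        k4 = slope(a + h, rho + h * k3)
        rho += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        a += h
    return rho

rho_matter = density_after_expansion(0.0)          # dust, P = 0
rho_radiation = density_after_expansion(1.0 / 3.0)  # radiation, P = rho / 3
print(rho_matter, rho_radiation)
```

Doubling the scale factor dilutes matter by a factor of 2³ = 8 and radiation by 2⁴ = 16, in line with $\rho_m \propto a^{-3}$ and $\rho_r \propto a^{-4}$.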
Einstein’s Field Equations
The stage is now set for deriving and understanding Einstein’s field equations.
The GR must present appropriate analogues of the two parts of the dynamical picture: 1) how
particles move in response to gravity; and 2) how particles generate gravitational effects. The first
part was answered when we derived the geodesic equation as the analogue of the Newton’s Second
Law. The second part requires finding the analogue of the Poisson equation
\[ \nabla^2 \Phi(\vec{x}) = 4\pi G \rho(\vec{x}), \tag{65} \]
which specifies how matter curves spacetime. It should also be obvious by now that all equations in
GR must be in tensor form. Arguably the most enlightening derivation of the Einstein’s equations
is to argue about its form on physical grounds, which was the approach originally adopted by
Einstein.
In Newtonian gravity, the rest mass generates gravitational effects. From SR, however, we
learned that the rest mass is just one form of energy, and that the mass and energy are equivalent.
Therefore, we should expect that in GR all sources of both energy and momentum contribute to
generating spacetime curvature. This means that in GR, the energy-momentum tensor T αβ is the
source for spacetime curvature in the same sense that the mass density ρ is the source for the
potential Φ. So, at this point, we can say that we have a pretty good idea of what the RHS of the
GR analogue of the Poisson equation should be: κT αβ (where κ is some constant to be determined
later).
What about the LHS of the GR analogue of the Poisson equation? What is analogous to $\nabla^2 \Phi(\vec{x})$? As we have seen earlier (eq. (39)), the spacetime metric in the Newtonian limit is modified by a term proportional to Φ. If we extend this analogy, then the GR counterpart of $\vec{\nabla}\Phi$ in the RHS of Newton’s Second Law should include derivatives of the metric, which is indeed verified by the form of the geodesic equation (see eqs. (14), (31)). Further extending this analogy, one would expect that the GR counterpart of $\nabla^2 \Phi(\vec{x})$ would contain second derivatives of the metric. From eq. (41), we see that the Riemann tensor $R_{\alpha\beta\gamma\delta}$, and consequently its contractions, the Ricci tensor $R_{\alpha\beta}$ and the Ricci scalar $R$, contain second derivatives of the metric, and thus become viable candidates for the LHS of Einstein’s field equation.
Led by this line of reasoning, Einstein originally suggested that the field equation might read
\[ R_{\alpha\beta} = \kappa T_{\alpha\beta}, \tag{66} \]
but it was quickly recognized that this cannot be correct: while the conservation of energy-momentum requires $T^{\alpha\beta}{}_{;\alpha} = 0$, the same is in general not true of the Ricci tensor, $R^{\alpha\beta}{}_{;\alpha} \neq 0$. Fortunately, Einstein’s tensor $G_{\alpha\beta}$ (a combination of the Ricci tensor and the Ricci scalar) satisfies the requirement that it has vanishing divergence. Einstein’s equation then becomes
\[ G_{\alpha\beta} \equiv R_{\alpha\beta} - \frac{1}{2} g_{\alpha\beta} R = \kappa T_{\alpha\beta}. \tag{67} \]
By matching Einstein’s equation in the Newtonian limit to the Poisson equation, the constant κ is found to be $8\pi G / c^4$, so Einstein’s field equations become (adopting our convention c = 1):
\[ R_{\alpha\beta} - \frac{1}{2} g_{\alpha\beta} R = 8\pi G\, T_{\alpha\beta}. \tag{68} \]
Lecture 4: The Cosmological Metric
“The most exciting phrase to hear in science, the one that heralds new discoveries, is not ‘Eureka!’
but ‘That’s funny...’ ”
Isaac Asimov
The Big Picture: Last time we derived Einstein’s equations (a GR analog to the Poisson equation), which describe how matter and radiation curve ambient spacetime. Today, we are going to derive the Friedmann-Lemaître-Robertson-Walker metrics for both flat and curved spacetimes in spherical coordinates, and look at the particular solutions for Universes with different contents.
The “standard model” of the Universe is founded on the Cosmological Principle which states
that our Universe is — at all times — homogeneous (same from point to point) and isotropic
(same view in all directions) when viewed on the large scales (galaxies, galaxy clusters, galaxy
super-clusters, etc. are considered as “local inhomogeneities”).
Consider four equally spaced observers a, b, c and d along a line, with neighboring observers separated by a distance R and receding from each other with velocity v. The velocity at which points d and a are moving from each other is then
\[ v_{da} = 3v \propto R_{da} = 3R \implies v_{da} = H R_{da}. \tag{69} \]
The assumption of isotropy of the standard model requires the constant H to be independent of direction (angles of spherical coordinates):
\[ H \neq H(\theta, \phi). \tag{70} \]
We therefore arrive at Hubble’s Law in vector form:
\[ \vec{v} = H(t)\, \vec{r}. \tag{71} \]
Hubble “constant” (rate) H(t) is actually not a constant but is given in terms of the scale factor a(t) as
\[ H(t) \equiv \frac{\dot{a}(t)}{a(t)}. \tag{72} \]
Current measurements of the Hubble rate are parametrized by h:
\[ H_0 = 100\, h\ \mathrm{km\ s^{-1}\ Mpc^{-1}} = \frac{h}{0.98 \times 10^{10}\ \mathrm{years}} = 2.133 \times 10^{-33}\, h\ \mathrm{eV}/\hbar, \tag{73} \]
with h ≈ 0.72 ± 0.02.
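The unit conversion in eq. (73) can be reproduced directly. The sketch below converts $H_0 = 100\,h$ km s⁻¹ Mpc⁻¹ into the corresponding Hubble time $1/H_0$ in years; the conversion factors (kilometres per megaparsec and seconds per year) are approximate values assumed for this example, and the function name is hypothetical.

```python
KM_PER_MPC = 3.0857e19        # approximate kilometres in one megaparsec
SECONDS_PER_YEAR = 3.156e7    # approximate seconds in one year

def hubble_time_years(h):
    """Hubble time 1/H0 in years for H0 = 100 h km/s/Mpc."""
    h0_per_second = 100.0 * h / KM_PER_MPC     # H0 in units of 1/s
    return 1.0 / h0_per_second / SECONDS_PER_YEAR

print(hubble_time_years(1.0))    # compare with the 0.98e10 years in eq. (73)
print(hubble_time_years(0.72))
```

For h = 1 this gives roughly 0.98 × 10¹⁰ years, matching eq. (73); for h ≈ 0.72 the Hubble time is somewhat longer, since a slower expansion rate corresponds to a longer characteristic time.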
Assumption of homogeneity of the standard model requires the Universe to have the same
curvature everywhere (just like the 2D surface of a sphere has the same curvature everywhere).
Consider a 3D sphere embedded in a 4D “hyperspace”:
\[ (x^1)^2 + (x^2)^2 + (x^3)^2 + (x^4)^2 = a^2, \tag{74} \]
where a is the radius of the 3D sphere. The distance between two points in 4D space is given by
\[ dl^2 = (dx^1)^2 + (dx^2)^2 + (dx^3)^2 + (dx^4)^2. \tag{75} \]
Differentiating eq. (74) and solving for $dx^4$, we obtain (recall i = 1, 2, 3)
\[ dx^4 = -\frac{x^i dx^i}{\sqrt{a^2 - x^i x^i}}, \tag{76} \]
so that eq. (75) now reads
\[ dl^2 = (dx^1)^2 + (dx^2)^2 + (dx^3)^2 + \frac{\left( x^i dx^i \right)^2}{a^2 - x^i x^i}. \tag{77} \]
In spherical coordinates
\[ x^1 = r \sin\theta \cos\phi, \qquad x^2 = r \sin\theta \sin\phi, \qquad x^3 = r \cos\theta, \]
so
\[ dx^i dx^i = dr^2 + r^2 d\theta^2 + (r \sin\theta)^2 d\phi^2, \qquad x^i dx^i = r\, dr, \qquad x^i x^i = r^2. \]
Finally, we obtain
\[ dl^2 = \frac{r^2 dr^2}{a^2 - r^2} + dr^2 + r^2 d\theta^2 + (r \sin\theta)^2 d\phi^2, \]
\[ dl^2 = \frac{dr^2}{1 - \left( \frac{r}{a} \right)^2} + r^2 d\theta^2 + (r \sin\theta)^2 d\phi^2. \tag{78} \]
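We could also have a negatively curved object (a “saddle”) with $a^2 \to -a^2$, or a flat (zero curvature, Euclidean) space with $a \to \infty$. In the literature, the following short-hand notation is adopted:
\[ dl^2 = \frac{dr^2}{1 - k \left( \frac{r}{a} \right)^2} + r^2 d\theta^2 + (r \sin\theta)^2 d\phi^2, \tag{79} \]
\[ ds^2 = -dt^2 + \frac{dr^2}{1 - k \left( \frac{r}{a} \right)^2} + r^2 d\theta^2 + (r \sin\theta)^2 d\phi^2, \tag{80} \]
\[ k = \begin{cases} +1 & \text{positive-curvature Universe (finite, closed)}, \\ 0 & \text{flat Universe (infinite, open)}, \\ -1 & \text{negative-curvature Universe (infinite, open)}. \end{cases} \]
To isolate the time-dependent term a, make the following substitution:
\[ r = \begin{cases} a \sin\chi & \text{positive-curvature Universe}, \\ a \chi & \text{flat Universe}, \\ a \sinh\chi & \text{negative-curvature Universe}. \end{cases} \tag{81} \]
Then
\[ dl^2 = a^2 \left[ d\chi^2 + \Sigma^2(\chi) \left( d\theta^2 + \sin^2\theta\, d\phi^2 \right) \right], \tag{82} \]
where
\[ \Sigma(\chi) \equiv \begin{cases} \sin\chi & \text{positive-curvature Universe}, \\ \chi & \text{flat Universe}, \\ \sinh\chi & \text{negative-curvature Universe}. \end{cases} \]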
(Important note: for small χ, sin χ ≈ χ, sinh χ ≈ χ. What does it mean?)
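One way to explore the question posed in the note is to compare Σ(χ) with χ numerically for decreasing χ. This is a minimal sketch; the function name `sigma` and the sample values of χ are chosen for illustration.

```python
import math

def sigma(chi, k):
    """Sigma(chi) for curvature index k = +1, 0 or -1, as defined after eq. (82)."""
    if k == 1:
        return math.sin(chi)
    if k == 0:
        return chi
    if k == -1:
        return math.sinh(chi)
    raise ValueError("k must be +1, 0 or -1")

for chi in (1.0, 0.1, 0.01):
    # Ratio to the flat-space value Sigma = chi.
    print(chi, sigma(chi, 1) / chi, sigma(chi, -1) / chi)
```

Both ratios approach 1 as χ shrinks, so on sufficiently small scales the closed and open geometries become indistinguishable from the flat one.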
If we introduce the “arc-parameter measure of time” (“conformal time”)
\[ d\eta \equiv \frac{dt}{a(t)}, \tag{83} \]
then we can express the 4D line element in terms of the Friedmann-Lemaître-Robertson-Walker metric:
\[ ds^2 = a^2(\eta) \left[ -d\eta^2 + d\chi^2 + \Sigma^2(\chi) \left( d\theta^2 + \sin^2\theta\, d\phi^2 \right) \right]. \tag{84} \]
Friedmann Equations
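We can now solve Einstein’s field equations for the perfect fluid. All the calculations are done in a comoving frame where
\[ u^0 = 1 = -u_0, \qquad \text{and} \qquad u^i = u_i = 0. \tag{85} \]
This means that the energy-momentum tensor is given by
\[ T_{\alpha\beta} = (\rho + P)\, u_{\alpha} u_{\beta} + P g_{\alpha\beta}. \tag{86} \]
Raising an index of Einstein’s field equation
\[ R_{\alpha\beta} - \frac{1}{2} g_{\alpha\beta} R = 8\pi G\, T_{\alpha\beta}, \tag{87} \]
we obtain
\[ R^{\alpha}{}_{\beta} - \frac{1}{2} \delta^{\alpha}_{\beta} R = 8\pi G\, T^{\alpha}{}_{\beta}. \tag{88} \]
(Recall $g^{\alpha\beta} g_{\beta\nu} = \delta^{\alpha}_{\nu}$.) After contracting over indices α and β, we obtain
\[ -R = 8\pi G\, T, \qquad \text{where } T \equiv T^{\alpha}{}_{\alpha}, \tag{89} \]
which means that Einstein’s field equation can be rewritten as
\[ R^{\alpha}{}_{\beta} = 8\pi G \left( T^{\alpha}{}_{\beta} - \frac{1}{2} \delta^{\alpha}_{\beta} T \right). \tag{90} \]
For the perfect fluid, it is easily found that
\[ T = -(\rho + P) + 4P = -\rho + 3P, \tag{91} \]
so eq. (90) becomes
\[ R^{\alpha}{}_{\beta} = 8\pi G \left[ (\rho + P)\, u^{\alpha} u_{\beta} + \frac{1}{2} (\rho - P)\, \delta^{\alpha}_{\beta} \right]. \tag{92} \]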
After straightforward yet tedious calculations (which I relegate to homework), we obtain the components of the Ricci tensor:
$$ R^0_0 = 3\frac{\ddot a}{a}, \qquad R^i_0 = 0, \qquad R^i_j = \frac{1}{a^2}\left(a\ddot a + 2\dot a^2 + 2k\right)\delta^i_j. \tag{93} $$
The $t$-$t$ component of Einstein's equation given in eq. (92) becomes
$$ \frac{3\ddot a}{a} = 8\pi G\left[-(\rho + P) + \frac{1}{2}(\rho - P)\right], \tag{94} $$
or
$$ \ddot a = -\frac{4\pi G}{3}(\rho + 3P)\,a. \tag{95} $$
The $i$-$i$ component of Einstein's equation is
$$ \frac{1}{a^2}\left(a\ddot a + 2\dot a^2 + 2k\right) = 8\pi G\,\frac{1}{2}(\rho - P), \tag{96} $$
or
$$ a\ddot a + 2\dot a^2 + 2k = 4\pi G(\rho - P)\,a^2. \tag{97} $$
The eqs. (95)-(97) are the basic equations connecting the scale factor a to ρ and P . To obtain a
closed system of equations, we only need an equation of state P = P (ρ), which relates P and ρ.
The system then reduces to two equations for two unknowns a and ρ.
It is, however, beneficial to further massage these basic equations into a set that is more easily solved. Solving eq. (97) for $\ddot a$, we obtain
$$ \ddot a = 4\pi G(\rho - P)\,a - \frac{2\dot a^2}{a} - \frac{2k}{a}, \tag{98} $$
which can be combined with eq. (95) to cancel out the $P$ dependence and yield
$$ \frac{16\pi G\rho a}{3} - \frac{2k}{a} - \frac{2\dot a^2}{a} = 0, \tag{99} $$
or
$$ \dot a^2 + k = \frac{8\pi G}{3}\rho a^2. \tag{100} $$
When combined with eq. (62), derived in the context of conservation of the energy-momentum tensor, and the equation of state, we obtain a closed system of Friedmann equations:
$$ \dot a^2 + k = \frac{8\pi G}{3}\rho a^2, \tag{101a} $$
$$ \frac{\partial\rho}{\partial t} + 3\frac{\dot a}{a}(\rho + P) = 0, \tag{101b} $$
$$ P = P(\rho). \tag{101c} $$
Lecture 5: Solutions of Friedmann Equations
“A man gazing at the stars is proverbially at the mercy of the puddles in the road.”
Alexander Smith
The Big Picture: Last time we derived the Friedmann equations, a closed set of equations that follow from Einstein's equations and relate the scale factor a(t), the energy density ρ and the pressure P for flat, closed and open Universes (denoted by the curvature constant k = 0, +1, −1, respectively). Today we are going to solve the Friedmann equations for the matter-dominated and radiation-dominated Universe and obtain the form of the scale factor a(t). We will also estimate the age of the flat Friedmann Universe.
From the definition of the Hubble rate H in eq. (72)
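$$ H \equiv \frac{\dot a}{a} \tag{102} $$
$$ \Longrightarrow \quad \dot H = -H^2 + \frac{\ddot a}{a} = -H^2\left(1 - \frac{\ddot a}{H^2 a}\right) \equiv -H^2(1 + q), \tag{103} $$
we define a deceleration parameter $q$ as
$$ q \equiv -\frac{\ddot a}{H^2 a}. \tag{104} $$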
Non-relativistic matter-dominated Universe is modeled by dust approximation: P = 0.
Then, from eq. (95), we have
$$ \frac{\ddot a}{a} + \frac{4\pi G}{3}\rho = 0, \tag{105} $$
and, in terms of $H$,
$$ -H^2 q + \frac{4\pi G}{3}\rho = 0. \tag{106} $$
Therefore
$$ \rho = \frac{3H^2}{4\pi G}\,q. \tag{107} $$
Then the first Friedmann equation becomes
$$ \left(\frac{\dot a}{a}\right)^2 - \frac{8\pi G}{3}\rho = -\frac{k}{a^2}, \qquad H^2 - 2H^2 q = -\frac{k}{a^2}, \tag{108} $$
so
$$ -k = a^2 H^2 (1 - 2q). \tag{109} $$
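Since both $a \neq 0$ and $H \neq 0$, a flat Universe ($k = 0$) has $q = 1/2$ (and $q > 1/2$ for $k = 1$, $q < 1/2$ for $k = -1$). When combined with eq. (107), this yields the critical density
$$ \rho_{cr} = \frac{3H^2}{8\pi G}, \tag{110} $$
the density needed to yield the flat Universe. Currently, it is (see eq. (73))
$$ \rho_{cr} = \frac{3H_0^2}{8\pi G} = \frac{3\left[\dfrac{h}{0.98\times 10^{10}\,\text{years}} \cdot \dfrac{1\,\text{year}}{3600\times 24\times 365\,\text{s}}\right]^2}{8\pi\left(6.67\times 10^{-8}\,\text{cm}^3\,\text{g}^{-1}\,\text{s}^{-2}\right)} = 1.87\times 10^{-29}\,h^2\,\frac{\text{g}}{\text{cm}^3} \approx 10^{-29}\,\frac{\text{g}}{\text{cm}^3}. $$
(We used h ≈ 0.72 ± 0.02.)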
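The arithmetic behind the critical density can be reproduced with a short script. This is only a sketch: the function names are illustrative, and the constants are the rounded values quoted in the text (G in cgs units, a 365-day year, and 1/H0 = 0.98×10^10/h years).

```python
import math

G_CGS = 6.67e-8                      # cm^3 g^-1 s^-2, rounded value used in the text
SECONDS_PER_YEAR = 3600 * 24 * 365   # same year length as in the text
HUBBLE_TIME_YEARS = 0.98e10          # 1/H0 = 0.98e10 / h years, as in the text


def hubble_rate_per_second(h):
    """H0 in s^-1 for a given dimensionless Hubble parameter h."""
    return h / (HUBBLE_TIME_YEARS * SECONDS_PER_YEAR)


def critical_density(h):
    """rho_cr = 3 H0^2 / (8 pi G) in g/cm^3, eq. (110) evaluated today."""
    H0 = hubble_rate_per_second(h)
    return 3.0 * H0**2 / (8.0 * math.pi * G_CGS)


if __name__ == "__main__":
    print(f"rho_cr (h = 1)    = {critical_density(1.0):.3e} g/cm^3")
    print(f"rho_cr (h = 0.72) = {critical_density(0.72):.3e} g/cm^3")
```

Because ρcr scales as h², the quoted coefficient is the h = 1 value, and the h = 0.72 value follows by multiplying by 0.72².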
It is important to note that the quantity $q$ provides the relationship between the density of the Universe $\rho$ and the critical density $\rho_{cr}$ (after combining eqs. (107) and (110)):
$$ q = \frac{\rho}{2\rho_{cr}}. \tag{111} $$
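The second Friedmann equation (eq. (101b)) for the matter-dominated Universe becomes
$$ \dot\rho + 3\rho\,\frac{\dot a}{a} = 0 \;\Rightarrow\; a^3\dot\rho + 3\rho\,\dot a\,a^2 = 0 \;\Rightarrow\; \frac{d}{dt}\left(a^3\rho\right) = 0 \;\Rightarrow\; a^3\rho = a_0^3\rho_0 = \text{const.} \tag{112} $$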
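Radiation-dominated Universe is modeled by perfect fluid approximation with $P = \frac{1}{3}\rho$.
The second Friedmann equation (eq. (101b)) becomes
$$ \dot\rho + 3\frac{\dot a}{a}\left(\rho + \frac{1}{3}\rho\right) = \dot\rho + 4\rho\,\frac{\dot a}{a} = 0 \;\Rightarrow\; a^4\dot\rho + 4\rho\,\dot a\,a^3 = 0 \;\Rightarrow\; \frac{d}{dt}\left(a^4\rho\right) = 0 \;\Rightarrow\; a^4\rho = a_0^4\rho_0 = \text{const.} \tag{113} $$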
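Flat Universe ($k = 0$, $q_0 = \frac{1}{2}$)
Matter-dominated (dust approximation): $P = 0$, $a^3\rho = \text{const}$.
The first Friedmann equation (eq. (101a)) becomes
$$ \frac{\dot a^2}{a^2} = \frac{8\pi G}{3}\rho_0\left(\frac{a_0}{a}\right)^3 \;\Rightarrow\; \frac{da}{dt} = \sqrt{\frac{8\pi G\rho_0 a_0^3}{3}}\,\frac{1}{a^{1/2}} \;\Rightarrow\; \int a^{1/2}\,da = \frac{2}{3}a^{3/2} + K = \sqrt{\frac{8\pi G\rho_0 a_0^3}{3}}\;t. \tag{114} $$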
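At the Big Bang, t = 0, a = 0, so K = 0. Upon adopting the convention $a_0 = 1$, and using the fact that for a flat Universe $\rho_0 = \rho_{cr}$, we finally have
$$ a = (6\pi G\rho_0)^{1/3}\,t^{2/3} = (6\pi G\rho_{cr})^{1/3}\,t^{2/3} = \left(6\pi G\,\frac{3H_0^2}{8\pi G}\right)^{1/3} t^{2/3} = \left(\frac{9H_0^2}{4}\right)^{1/3} t^{2/3} = \left(\frac{3H_0}{2}\right)^{2/3} t^{2/3}, \tag{115} $$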
where we have used eq. (110) in the second step. From here we compute the age of the Universe $t_0$, which corresponds to the Hubble rate $H_0$ and the scale factor $a = a_0 = 1$, to be
$$ t_0 = \frac{2}{3H_0}. \tag{116} $$
Taking $H_0 = \dfrac{h}{0.98\times 10^{10}\,\text{years}}$ and $h \approx 0.72$, we get
$$ t_0 = \frac{2\times 0.98\times 10^{10}\,\text{years}}{3\times 0.72} \approx 9.1\times 10^{9}\,\text{years} \equiv 9.1\,\text{A (aeon)}. \tag{117} $$
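As a cross-check of eqs. (116) and (117), the age can also be obtained by integrating dt = da/ȧ directly. The sketch below assumes the flat, matter-dominated relation ȧ = H0 a^(−1/2) (eq. (114) with a0 = 1 and ρ0 = ρcr); the function names and the midpoint-rule integrator are illustrative choices, not part of the notes.

```python
HUBBLE_TIME_YEARS = 0.98e10  # 1/H0 = 0.98e10 / h years, as in the text


def age_flat_matter_closed_form(h):
    """t0 = 2 / (3 H0) in years, eq. (116)."""
    return 2.0 / 3.0 * HUBBLE_TIME_YEARS / h


def age_flat_matter_numeric(h, steps=100_000):
    """t0 = (1/H0) * integral_0^1 a^(1/2) da, evaluated with the midpoint rule."""
    hubble_time = HUBBLE_TIME_YEARS / h
    da = 1.0 / steps
    total = 0.0
    for i in range(steps):
        a = (i + 0.5) * da          # midpoint of the i-th sub-interval
        total += a**0.5 * da        # dt * H0 = a^(1/2) da
    return hubble_time * total


if __name__ == "__main__":
    print(f"closed form: {age_flat_matter_closed_form(0.72):.3e} years")
    print(f"numeric:     {age_flat_matter_numeric(0.72):.3e} years")
```

The two numbers should agree to many digits, which confirms that the factor 2/3 comes entirely from the a^(1/2) integrand.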
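Radiation-dominated: $P = \frac{1}{3}\rho$, $a^4\rho = \text{const}$.
The first Friedmann equation (eq. (101a)) becomes
$$ \frac{\dot a^2}{a^2} = \frac{8\pi G}{3}\rho_0\left(\frac{a_0}{a}\right)^4 \;\Rightarrow\; \frac{da}{dt} = \sqrt{\frac{8\pi G\rho_0 a_0^4}{3}}\,\frac{1}{a} \;\Rightarrow\; \int a\,da = \frac{1}{2}a^2 + K = \sqrt{\frac{8\pi G\rho_0 a_0^4}{3}}\;t. \tag{118} $$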
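Again, at the Big Bang, t = 0, a = 0, so K = 0, and $a_0 = 1$. Also $\rho_0 = \rho_{cr}$. Therefore,
$$ a = \left(\frac{32}{3}\pi G\rho_0\right)^{1/4} t^{1/2} = \left(\frac{32}{3}\pi G\rho_{cr}\right)^{1/4} t^{1/2} = \left(\frac{32}{3}\pi G\,\frac{3H_0^2}{8\pi G}\right)^{1/4} t^{1/2} = (2H_0)^{1/2}\,t^{1/2}. \tag{119} $$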
[Figure 2 plot, titled "Flat Friedmann Universe (k=0, q0=1/2)": a(t) versus t, with curves labeled matter-dominated and radiation-dominated.]
Figure 2: Evolution of the scale factor a(t) for the flat Friedmann Universe.
Closed Universe ($k = 1$, $q_0 > \frac{1}{2}$)
Matter-dominated (dust approximation): $P = 0$, $a^3\rho = \text{const}$.
The first Friedmann equation (eq. (101a)) becomes
$$ \frac{\dot a^2}{a^2} = \frac{8\pi G}{3}\rho_0\left(\frac{a_0}{a}\right)^3 - \frac{1}{a^2} \;\Rightarrow\; \frac{da}{dt} = \sqrt{\frac{8\pi G\rho_0 a_0^3}{3a} - 1} \;\Rightarrow\; \int dt = \int\frac{da}{\sqrt{\dfrac{8\pi G\rho_0 a_0^3}{3a} - 1}}. $$
Rewrite the integral above in terms of conformal time given in eq. (83) ($d\eta \equiv dt/a$):
$$ \int d\eta = \int\frac{da}{\sqrt{\dfrac{8\pi G\rho_0 a_0^3}{3}\,a - a^2}}, \tag{120} $$
and define, after substituting $a_0 = 1$ and using eqs. (107)-(109),
$$ A \equiv \frac{4\pi G\rho_0}{3} = H_0^2 q_0 = \frac{q_0}{2q_0 - 1}. \tag{121} $$
Then
$$ \eta - \eta_0 = \int_0^a\frac{d\tilde a}{\sqrt{2A\tilde a - \tilde a^2}} = \sin^{-1}\left(\frac{a - A}{A}\right) + \frac{1}{2}\pi. \tag{122} $$
But the requirement η = 0 at a = 0 sets η0 = 0, so we have
$$ \frac{a - A}{A} = \sin\left(\eta - \frac{1}{2}\pi\right) = -\cos\eta \;\Rightarrow\; a = A(1 - \cos\eta). \tag{123} $$
Now $dt = a\,d\eta$, so
$$ t - t_0 = \int a\,d\eta = \int A(1 - \cos\eta)\,d\eta = A\int(1 - \cos\eta)\,d\eta = A(\eta - \sin\eta). \tag{124} $$
But the requirement η = 0 at t = 0 sets t0 = 0. Therefore, we finally have the dependence of the scale factor a on the time t, parametrized by the conformal time η:
$$ a = \frac{q_0}{2q_0 - 1}(1 - \cos\eta), \qquad t = \frac{q_0}{2q_0 - 1}(\eta - \sin\eta). \tag{125} $$
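The parametric solution can be checked numerically against the Friedmann equation it came from. With a0 = 1 and a³ρ = ρ0, eq. (101a) for k = 1 reads ȧ² + 1 = 2A/a, where A = q0/(2q0 − 1). The sketch below (function names are illustrative) estimates ȧ = (da/dη)/(dt/dη) with a central finite difference and evaluates the residual.

```python
import math


def closed_matter_universe(q0, eta):
    """Scale factor and time for the closed dust Universe, eq. (125)."""
    A = q0 / (2.0 * q0 - 1.0)
    a = A * (1.0 - math.cos(eta))
    t = A * (eta - math.sin(eta))
    return a, t


def friedmann_residual(q0, eta, d_eta=1e-5):
    """Residual of a_dot^2 + 1 - 2A/a; it should vanish for a valid solution."""
    A = q0 / (2.0 * q0 - 1.0)
    a_lo, t_lo = closed_matter_universe(q0, eta - d_eta)
    a_hi, t_hi = closed_matter_universe(q0, eta + d_eta)
    a_dot = (a_hi - a_lo) / (t_hi - t_lo)   # central difference for da/dt
    a, _ = closed_matter_universe(q0, eta)
    return a_dot**2 + 1.0 - 2.0 * A / a


if __name__ == "__main__":
    for eta in (0.5, 1.0, 2.0, 3.0):
        print(eta, friedmann_residual(1.0, eta))
```

The same parametrization shows the turnaround at η = π, where a = 2A, and the return to a = 0 at η = 2π, where t = 2πA.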
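Radiation-dominated: $P = \frac{1}{3}\rho$, $a^4\rho = \text{const}$.
The first Friedmann equation (eq. (101a)) becomes
$$ \frac{\dot a^2}{a^2} = \frac{8\pi G}{3}\rho_0\left(\frac{a_0}{a}\right)^4 - \frac{1}{a^2} \;\Rightarrow\; \frac{da}{dt} = \sqrt{\frac{8\pi G\rho_0 a_0^4}{3a^2} - 1} \;\Rightarrow\; \int dt = \int\frac{da}{\sqrt{\dfrac{8\pi G\rho_0 a_0^4}{3a^2} - 1}}. $$
Again, rewrite the integral above in terms of conformal time and the quantity $A_1 = \dfrac{8\pi G\rho_0}{3} = \dfrac{2q_0}{2q_0 - 1}$ (taking $a_0 = 1$):
$$ \eta - \eta_0 = \int_0^a\frac{d\tilde a}{\sqrt{A_1 - \tilde a^2}} = \sin^{-1}\left(\frac{a}{\sqrt{A_1}}\right). \tag{126} $$
Again, the requirement η = 0 at a = 0 sets η0 = 0, so we have
$$ a = \sqrt{A_1}\,\sin\eta, \tag{127} $$
and, since $dt = a\,d\eta$,
$$ t - t_0 = -\sqrt{A_1}\,\cos\eta. \tag{128} $$
The requirement η = 0 at t = 0 sets $t_0 = \sqrt{A_1}$, so we finally have
$$ a = \sqrt{\frac{2q_0}{2q_0 - 1}}\,\sin\eta, \qquad t = \sqrt{\frac{2q_0}{2q_0 - 1}}\,(1 - \cos\eta). \tag{129} $$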
[Figure 3 plot, titled "Closed Friedmann Universe (k=1, q0>1/2)": a(t) versus t for the matter-dominated and radiation-dominated cases, each ending in a Big Crunch.]
Figure 3: Evolution of the scale factor a(t) for the closed Friedmann Universe.
In both matter- and radiation-dominated closed Universes, the evolution is cycloidal — the scale
factor grows at an ever-decreasing rate until it reaches a point at which the expansion is halted and
reversed. The Universe then starts to compress and it finally collapses in the Big Crunch.
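Open Universe ($k = -1$, $q_0 < \frac{1}{2}$)
Matter-dominated (dust approximation): $P = 0$, $a^3\rho = \text{const}$.
The first Friedmann equation (eq. (101a)) becomes
$$ \frac{\dot a^2}{a^2} = \frac{8\pi G}{3}\rho_0\left(\frac{a_0}{a}\right)^3 + \frac{1}{a^2} \;\Rightarrow\; \frac{da}{dt} = \sqrt{\frac{8\pi G\rho_0 a_0^3}{3a} + 1} \;\Rightarrow\; \int dt = \int\frac{da}{\sqrt{\dfrac{8\pi G\rho_0 a_0^3}{3a} + 1}}. $$
Again, rewrite the integral above in terms of conformal time:
$$ \int d\eta = \int\frac{da}{\sqrt{\dfrac{8\pi G\rho_0 a_0^3}{3}\,a + a^2}}, \tag{130} $$
take $a_0 = 1$, and define $\tilde A \equiv \dfrac{4\pi G\rho_0}{3} = \dfrac{q_0}{1 - 2q_0}$ (positive for $q_0 < \frac{1}{2}$; this follows from eqs. (107) and (109) with $k = -1$). Then
$$ \eta - \eta_0 = \int_0^a\frac{d\tilde a}{\sqrt{2\tilde A\tilde a + \tilde a^2}} = \ln\left[\frac{a + \tilde A + \sqrt{a(2\tilde A + a)}}{\tilde A}\right] = \ln\left[\frac{a}{\tilde A} + 1 + \sqrt{\left(\frac{a}{\tilde A} + 1\right)^2 - 1}\,\right] = \cosh^{-1}\left(\frac{a}{\tilde A} + 1\right). \tag{131} $$
But the requirement η = 0 at a = 0 sets η0 = 0, so we have
$$ \frac{a + \tilde A}{\tilde A} = \cosh\eta \;\Rightarrow\; a = \tilde A(\cosh\eta - 1). \tag{132} $$
Now $dt = a\,d\eta$, so
$$ t - t_0 = \int a\,d\eta = \int\tilde A(\cosh\eta - 1)\,d\eta = \tilde A\int(\cosh\eta - 1)\,d\eta = \tilde A(\sinh\eta - \eta). \tag{133} $$
But the requirement η = 0 at t = 0 sets t0 = 0. Therefore, we finally have the dependence of the scale factor a on the time t, parametrized by the conformal time η:
$$ a = \frac{q_0}{1 - 2q_0}(\cosh\eta - 1), \qquad t = \frac{q_0}{1 - 2q_0}(\sinh\eta - \eta). \tag{134} $$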
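Radiation-dominated: $P = \frac{1}{3}\rho$, $a^4\rho = \text{const}$.
The first Friedmann equation (eq. (101a)) becomes
$$ \frac{\dot a^2}{a^2} = \frac{8\pi G}{3}\rho_0\left(\frac{a_0}{a}\right)^4 + \frac{1}{a^2} \;\Rightarrow\; \frac{da}{dt} = \sqrt{\frac{8\pi G\rho_0 a_0^4}{3a^2} + 1} \;\Rightarrow\; \int dt = \int\frac{da}{\sqrt{\dfrac{8\pi G\rho_0 a_0^4}{3a^2} + 1}}. $$
Again, rewrite the integral above in terms of conformal time and the quantity $\tilde A_1 \equiv \dfrac{8\pi G\rho_0}{3} = \dfrac{2q_0}{1 - 2q_0}$ (taking $a_0 = 1$):
$$ \eta - \eta_0 = \int_0^a\frac{d\tilde a}{\sqrt{\tilde A_1 + \tilde a^2}} = \sinh^{-1}\left(\frac{a}{\sqrt{\tilde A_1}}\right). \tag{135} $$
Again, the requirement η = 0 at a = 0 sets η0 = 0, so we have
$$ a = \sqrt{\tilde A_1}\,\sinh\eta, \tag{136} $$
$$ t - t_0 = \sqrt{\tilde A_1}\,\cosh\eta. \tag{137} $$
The requirement η = 0 at t = 0 sets $t_0 = -\sqrt{\tilde A_1}$, so we finally have
$$ a = \sqrt{\frac{2q_0}{1 - 2q_0}}\,\sinh\eta, \qquad t = \sqrt{\frac{2q_0}{1 - 2q_0}}\,(\cosh\eta - 1). \tag{138} $$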
Early times (small η limit): For small values of η, the trigonometric and hyperbolic functions can be expanded in Taylor series (keeping only first two terms):
$$ \cos\eta \approx 1 - \frac{1}{2}\eta^2, \qquad \sin\eta \approx \eta - \frac{1}{6}\eta^3, $$
$$ \cosh\eta \approx 1 + \frac{1}{2}\eta^2, \qquad \sinh\eta \approx \eta + \frac{1}{6}\eta^3, $$
so, to the leading term, the a and t dependence on η for the different curvatures is shown in the
table below:
Moral: at early times, the curvature of the Universe does not matter: singular behavior at early times is essentially independent of the curvature of the Universe (k). The Big Bang is a "matter-dominated singularity".
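The early-time statement can be illustrated numerically for the matter-dominated case. To leading order a ≈ Aη²/2 and t ≈ Aη³/6, so a ≈ (9A/2)^(1/3) t^(2/3), which is the flat result (6πGρ0)^(1/3) t^(2/3) because A = 4πGρ0/3. The sketch below (helper names are illustrative) prints the ratio of the exact closed and open solutions to that flat law.

```python
import math


def closed_matter(A, eta):
    """Closed (k = +1) dust solution, eqs. (123)-(124)."""
    return A * (1.0 - math.cos(eta)), A * (eta - math.sin(eta))


def open_matter(A, eta):
    """Open (k = -1) dust solution, eqs. (132)-(133)."""
    return A * (math.cosh(eta) - 1.0), A * (math.sinh(eta) - eta)


def ratio_to_flat(solution, A, eta):
    """a(eta) divided by the flat-Universe value (9A/2)^(1/3) t^(2/3)."""
    a, t = solution(A, eta)
    return a / ((4.5 * A) ** (1.0 / 3.0) * t ** (2.0 / 3.0))


if __name__ == "__main__":
    for eta in (1.0, 0.1, 0.01):
        print(eta,
              ratio_to_flat(closed_matter, 1.0, eta),
              ratio_to_flat(open_matter, 1.0, eta))
```

Both ratios approach 1 as η decreases: the closed solution from below and the open solution from above.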
[Figure 4 plot, titled "Open Friedmann Universe (k=-1, q0<1/2)": a(t) versus t, with curves labeled matter-dominated and radiation-dominated.]
Figure 4: Evolution of the scale factor a(t) for the open Friedmann Universe.
[Figure 5 plot, titled "Matter-Dominated Friedmann Universes": a(t) versus t for the flat, open and closed cases, starting at the Big Bang; the closed case ends in a Big Crunch.]
Figure 5: Evolution of the scale factor a(t) for the flat, closed and open matter-dominated Friedmann Universes.
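Table 2: Scale factor a(t) for flat, closed and open Friedmann Universes, along with their asymptotic behavior at early times.

| curvature k | a (for all η) | t (for all η) | a (for small η) | t (for small η) | a(t) (for small η) |
|---|---|---|---|---|---|
| 0 | $(6\pi G\rho_0)^{1/3}\,t^{2/3}$ | $t$ | $\propto t^{2/3}$ | $t$ | $\propto t^{2/3}$ |
| 1 | $\frac{q_0}{2q_0-1}(1 - \cos\eta)$ | $\frac{q_0}{2q_0-1}(\eta - \sin\eta)$ | $\propto \eta^2$ | $\propto \eta^3$ | $\propto t^{2/3}$ |
| -1 | $\frac{q_0}{1-2q_0}(\cosh\eta - 1)$ | $\frac{q_0}{1-2q_0}(\sinh\eta - \eta)$ | $\propto \eta^2$ | $\propto \eta^3$ | $\propto t^{2/3}$ |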
Lecture 6: Age of the Universe
“The effort to understand the Universe is one of the very few things that lifts human life a little
above the level of farce, and gives it some of the grace of tragedy.”
Steven Weinberg
The Big Picture: Last time we solved Friedmann equations for the matter-dominated and radiation-dominated flat, open and closed Universes and obtained the form of the scale factor a(t). We computed the critical density needed to have a flat Universe, about $10^{-29}\ \text{g cm}^{-3}$. We also estimated the age of the flat Friedmann Universe at about 9 billion years. Today we are going to combine the information discovered by observations of CMB radiation with the solutions of the Friedmann equations to present strong evidence for an additional vacuum energy and non-baryonic matter (dark energy and dark matter).
Age of a Matter-Dominated Friedmann Universe
At the present time, $t = t_0$ (age of the Universe), $a(t_0) = a_0 = 1$ and $q = q_0$, so eq. (111) provides the link between the total current density of the Universe and the critical density:
$$ q_0 = \frac{\rho_0}{2\rho_{cr}}. \tag{139} $$
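Friedmann equations provide the link between the age of the Universe $t_0$ and the present density of the Universe, given in terms of critical density $\rho_{cr}$ via the quantity $q_0$ (Homework set #1):
$$ t_0 = \frac{1}{H_0}\begin{cases} \dfrac{1}{1 - 2q_0} - \dfrac{q_0}{(1 - 2q_0)^{3/2}}\cosh^{-1}\left(\dfrac{1 - q_0}{q_0}\right) & \text{for } q_0 < \frac{1}{2}, \\[2ex] \dfrac{1}{1 - 2q_0} + \dfrac{q_0}{(2q_0 - 1)^{3/2}}\cos^{-1}\left(\dfrac{1 - q_0}{q_0}\right) & \text{for } q_0 \ge \frac{1}{2}. \end{cases} $$
At $q_0 = \frac{1}{2}$ the expression is understood as its limit, $t_0 = 2/(3H_0)$, in agreement with eq. (116).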
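A small function makes the age formula easy to explore. This is a sketch under the assumptions of this section (matter-dominated Universe, a0 = 1); the function name and the tolerance used to switch to the flat limit are illustrative choices.

```python
import math


def hubble_age_product(q0, flat_tolerance=1e-6):
    """H0 * t0 for a matter-dominated Friedmann Universe with deceleration q0 > 0."""
    if q0 <= 0.0:
        raise ValueError("q0 must be positive for a matter-dominated Universe")
    if abs(q0 - 0.5) < flat_tolerance:
        return 2.0 / 3.0                      # flat limit, eq. (116)
    x = (1.0 - q0) / q0
    if q0 < 0.5:                              # open Universe, x > 1
        s = 1.0 - 2.0 * q0
        return 1.0 / s - q0 / s**1.5 * math.acosh(x)
    s = 2.0 * q0 - 1.0                        # closed Universe, -1 < x < 1
    return -1.0 / s + q0 / s**1.5 * math.acos(x)


if __name__ == "__main__":
    for q0 in (0.01, 0.25, 0.5, 1.0, 2.0):
        print(f"q0 = {q0:4.2f}   H0 t0 = {hubble_age_product(q0):.4f}")
```

The product H0 t0 decreases monotonically with q0: it tends to 1 as q0 → 0, equals 2/3 for the flat case, and is π/2 − 1 at q0 = 1.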
[Figure 6 plot, titled "Age of the Matter-Dominated Friedmann Universe": H0 t0 versus q0 for q0 between 0 and 2, with the open and closed branches labeled and the flat value H0 t0 = 2/3 marked; annotated with H0^-1 = 14 Aeons.]
Figure 6: Age of the matter-dominated Friedmann Universe. Note that because q0 ∝ ρ0 , higher density
implies younger Universe.
However, observations such as those of the Wilkinson Microwave Anisotropy Probe (WMAP) find the age of the Universe to be
$$ t_0 = 13.7 \pm 0.2\,\text{A}, \tag{140} $$
which would imply, from the graph above, that $q_0 \approx 0$, that is $\rho_0 \approx 0$: there is no matter in the Universe! But that is not the case. WMAP data also indicate that the Universe is (very) nearly flat, so $q_0 = 1/2$. Hmmm... Something is wrong with the matter-dominated Friedmann Universe: it is missing most of its energy density.
Einstein’s Field Equations Revisited: Cosmological Constant
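Einstein first introduced the cosmological constant Λ in his field equations in order to avoid a solution that was embarrassing at the time: a non-steady-state Universe. Einstein's equations with the cosmological constant have the form
$$ R_{\alpha\beta} - \frac{1}{2} g_{\alpha\beta} R + g_{\alpha\beta}\Lambda = 8\pi G\,T_{\alpha\beta}, \tag{141} $$
or, alternatively,
$$ R^\alpha_\beta - \frac{1}{2}\delta^\alpha_\beta R + \delta^\alpha_\beta\Lambda = 8\pi G\,T^\alpha_\beta, \tag{142} $$
$$ R^\alpha_\beta - \frac{1}{2}\delta^\alpha_\beta R = 8\pi G\,\tilde T^\alpha_\beta, \tag{143} $$
where $\tilde T^\alpha_\beta = T^\alpha_\beta - \dfrac{\Lambda}{8\pi G}\delta^\alpha_\beta$ and
$$ \tilde T^\alpha_\beta = \begin{pmatrix} -\rho - \frac{\Lambda}{8\pi G} & 0 & 0 & 0 \\ 0 & P - \frac{\Lambda}{8\pi G} & 0 & 0 \\ 0 & 0 & P - \frac{\Lambda}{8\pi G} & 0 \\ 0 & 0 & 0 & P - \frac{\Lambda}{8\pi G} \end{pmatrix}. \tag{144} $$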
The new energy-momentum tensor $\tilde T^\alpha_\beta$ reveals the nature of the cosmological constant Λ: it is a source of energy density and of negative pressure (opposing the pressure of matter). Indeed, this is what led to the coining of the name dark energy.
The density of dark energy does not depend on the scale factor a. The conservation law (and also the second Friedmann equation) (eq. (62))
$$ \frac{\partial\rho}{\partial t} + 3\frac{\dot a}{a}(\rho + P) = 0 \tag{145} $$
then implies that the equation of state for the dark energy is $P(\rho) = -\rho$. More generally, since the equations of state for matter and radiation are $P(\rho) = 0$ and $P(\rho) = \frac{1}{3}\rho$, respectively, they can all be expressed as
$$ P(\rho) = w\rho, \tag{146} $$
where the parameter $w = -1$ for dark energy, $w = 0$ for matter and $w = 1/3$ for radiation.
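Consider a mixture of matter and dark energy:
$$ \rho = \rho_m + \rho_{de} = \rho_{m0}\left(\frac{a_0}{a}\right)^3 + \rho_{de}. \tag{147} $$
Define
$$ \Omega_{m0} \equiv \frac{\rho_{m0}}{\rho_{cr0}} = \frac{8\pi G}{3H_0^2}\rho_{m0}, \qquad \Omega_{de0} \equiv \frac{\rho_{de0}}{\rho_{cr0}} = \frac{8\pi G}{3H_0^2}\rho_{de0}. \tag{148} $$
Now rewrite the first Friedmann equation (eq. (101a)):
$$ \left(\frac{\dot a}{a}\right)^2 - \frac{8\pi G}{3}\rho = -\frac{k}{a^2}, $$
$$ \left(\frac{\dot a}{a}\right)^2 - H_0^2\,\Omega_{m0}\left(\frac{a_0}{a}\right)^3 - H_0^2\,\Omega_{de0} = -\frac{k}{a^2}. \tag{149} $$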
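Combining eqs. (109) and (111), we have
$$ -k = a^2 H^2 (1 - \Omega_T), \tag{150} $$
where
$$ \Omega_T \equiv 2q = \frac{\rho}{\rho_{cr}} = \Omega_m + \Omega_{de}. \tag{151} $$
From WMAP observations the Universe is nearly flat, so k = 0, which leads to
$$ \Omega_T = \Omega_{T0} = \Omega_{m0} + \Omega_{de0} = 1, \tag{152} $$
$$ \Rightarrow \quad \Omega_{m0} = 1 - \Omega_{de0}, \tag{153} $$
and, after taking $a_0 = 1$,
$$ \left(\frac{\dot a}{a}\right)^2 = H_0^2\left[(1 - \Omega_{de0})\frac{1}{a^3} + \Omega_{de0}\right]. \tag{154} $$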
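Solving for $\dot a$, this becomes
$$ \dot a = H_0\sqrt{\frac{1 - \Omega_{de0}}{a} + \Omega_{de0}\,a^2}, \tag{155} $$
and
$$ H_0 t_0 = \int_0^1\frac{da}{\sqrt{\dfrac{1 - \Omega_{de0}}{a} + \Omega_{de0}\,a^2}} = \int_0^1\frac{a^{1/2}\,da}{\sqrt{(1 - \Omega_{de0}) + \Omega_{de0}\,a^3}} $$
$$ = \frac{2}{3\sqrt{\Omega_{de0}}}\left[\ln\left(2\left(\sqrt{\Omega_{de0}\,a^3} + \sqrt{\Omega_{de0}(a^3 - 1) + 1}\right)\right)\right]_0^1 = \frac{2}{3\sqrt{\Omega_{de0}}}\ln\left(\frac{1 + \sqrt{\Omega_{de0}}}{\sqrt{1 - \Omega_{de0}}}\right), \tag{156} $$
so the age of the Universe with dark energy is
$$ t_0 = \frac{2}{3H_0\sqrt{\Omega_{de0}}}\ln\left(\frac{1 + \sqrt{\Omega_{de0}}}{\sqrt{1 - \Omega_{de0}}}\right). \tag{157} $$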
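The closed form in eq. (157) can be compared with a direct numerical evaluation of the integral in eq. (156). The sketch below assumes a flat Universe containing only matter and a cosmological constant, as in this section; the function names and the midpoint-rule integrator are illustrative.

```python
import math


def age_with_lambda_closed_form(omega_de):
    """H0 * t0 from eq. (157), valid for 0 < omega_de < 1."""
    if not 0.0 < omega_de < 1.0:
        raise ValueError("omega_de must lie strictly between 0 and 1")
    root = math.sqrt(omega_de)
    return 2.0 / (3.0 * root) * math.log((1.0 + root) / math.sqrt(1.0 - omega_de))


def age_with_lambda_numeric(omega_de, steps=200_000):
    """H0 * t0 from the second integral in eq. (156), midpoint rule."""
    da = 1.0 / steps
    total = 0.0
    for i in range(steps):
        a = (i + 0.5) * da
        total += math.sqrt(a) / math.sqrt((1.0 - omega_de) + omega_de * a**3) * da
    return total


if __name__ == "__main__":
    for omega_de in (0.1, 0.5, 0.72, 0.9):
        print(omega_de,
              age_with_lambda_closed_form(omega_de),
              age_with_lambda_numeric(omega_de))
```

Multiplying the result by the Hubble time 1/H0 gives the age in years. In the limit Ωde0 → 0 the expression reduces to the matter-only value 2/3.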
As $\Omega_{de0} \to 1$, $t_0 \to \infty$, so some matter is needed to keep the age of the Universe finite. So, from the observations we obtained the age of the Universe, and from the w model for the equation of state of matter and dark energy, we found
$$ \Omega_{T0} = \Omega_{m0} + \Omega_{de0} = 1 \quad \text{(the Universe is flat)}, \tag{158} $$
$$ \Omega_{de0} = 0.72 \quad \Rightarrow \quad \Omega_{m0} = 0.28, \tag{159} $$
which means that
$$ \frac{\Omega_{m0}}{\Omega_T}\times 100\% = 28\% \text{ of the Universe is matter}, $$
$$ \frac{\Omega_{de0}}{\Omega_T}\times 100\% = 72\% \text{ of the Universe is dark energy}. $$
The WMAP data also indicates that only 4% of the Universe is baryonic (normal) matter, and
that the remaining 24% is in some other still unknown form (dark matter). This means that we
are completely ignorant of what 96% of the Universe is composed of!
[Figure 7 plot, titled "Age of the Universe with a Cosmological Constant": t0 [Aeons] versus Ωde0 for Ωde0 between 0 and 0.8, with the point Ωde0 = 0.72, t0 = 13.7 Aeons marked.]
Figure 7: Age of the Universe with a cosmological constant Λ. The age of the Universe of 13.7 A corresponds to Ωde0 ≈ 0.72.
[Figure 8 plot, titled "Energy Density Vs. Scale Factor": log10[ρ(t)/ρcr] versus a(t) on a logarithmic axis from 1e-04 to 1 (today), with curves for radiation, matter and dark energy (Λ).]
Figure 8: Relative importance of matter, radiation and the cosmological constant Λ. The fact that today
the cosmological constant and the matter content are of the same order of magnitude for the first time in
the history of the Universe constitutes a so-called cosmological coincidence problem.
Lecture 7: Cosmic Distances
“Science never solves a problem without creating ten more.”
George Bernard Shaw
The Big Picture: Last time we introduced the dark energy as the dominant driving mechanism for
the cosmic expansion. Today we are going to introduce the redshift as a consequence of expansion
of the Universe, and introduce the relevant lengths associated with an expanding Universe.
Redshift
If the wavelength of the emission line in the laboratory is $\lambda_0$ and if the observed wavelength is $\lambda > \lambda_0$, then the line is said to be redshifted by a fraction z (the redshift) given by
$$ z = \frac{\lambda - \lambda_0}{\lambda_0}. \tag{160} $$
The redshift is a natural consequence of the Doppler effect: as the Universe expands, the wavelength of a photon emitted when the scale factor was $a$ (with $a_0 = 1$ today) is stretched to
$$ \lambda = \frac{\lambda_0}{a}, \tag{161} $$
which, combined with eq. (160), yields
$$ z = \frac{1 - a}{a}, \tag{162} $$
$$ a = \frac{1}{1 + z}. \tag{163} $$
Gravitational redshift is observed when a receiver is located at a higher gravitational potential
than the source. The physical explanation is that the particle loses a fraction of the energy (and
hence increases its wavelength) by overcoming the difference in the potential (climbing out of the
potential well).
Comoving Coordinates
GR states that the laws of physics are the same in any coordinates. However, some coordinates are easier to work with than others. One such set of coordinates are comoving coordinates in which an observer is comoving with the Hubble flow. Only for these observers in the comoving coordinates is the Universe isotropic (otherwise, portions of the sky will appear systematically blue- or red-shifted).
Comoving Horizon
Comoving horizon is defined as the total portion of the Universe visible to the observer. It represents the sphere with radius equal to the distance the light could have traveled (in the absence of interactions) since the Big Bang (t = 0). In time dt, light travels a comoving distance dη = dx/a = c dt/a, where dx is a physical distance. After recalling the convention c = 1 adopted earlier, the comoving horizon becomes
$$ \eta \equiv \int_0^t\frac{dt'}{a(t')}. \tag{164} $$
Figure 9: Comoving and physical distances. For an observer located at the center of the circle (stationary
in the comoving coordinates), the Universe looks isotropic and homogeneous and it expands in all directions
evenly. The comoving coordinates remain fixed, while the physical distance grows as a(t). The two distances
are related as d = ax, where d is physical and x is comoving distance.
η is called the conformal time. Because it is a monotonically increasing function of time t, it can be used as an independent variable when discussing the evolution of the Universe (just like the time t, temperature T, redshift z and the scale factor a). In some approximations, eq. (164) above can be solved analytically. For instance, in a matter-dominated Universe $\eta \propto a^{1/2}$ and in a radiation-dominated Universe $\eta \propto a$ (Homework set #1).
The importance of the comoving horizon η is in the fact that, under the standard cosmological model, the portions of the sky on our comoving horizon which are separated by more than η are not causally connected (there has not been an "exchange of information" between these regions). This means that, in the absence of interaction, these parts should have evolved differently and reached different temperatures. But they are all very similar: the CMB radiation, as measured by the WMAP probe, is isotropic to a few parts in $10^5$! This is called the horizon problem.
The only way to resolve this problem is to allow for all observable matter to have been causally
connected early in the history of the Universe.
Inflation
The most obvious way to solve the horizon problem is to allow all matter to interact, and
therefore acquire (virtually) the same statistical properties, during the brief period of exponential
expansion — inflation — immediately following the Big Bang.
Consider an epoch during which the dark energy dominates the matter density: $\Omega_{de} \gg \Omega_m$, so $\Omega_T \approx \Omega_{de}$ and $\Omega_m \approx 0$. If we take $k = 0$, so $\Omega_T = \Omega_{de} = 1$, eq. (154) becomes
$$ \left(\frac{\dot a}{a}\right)^2 = H^2\,\Omega_{de} = H^2 \;\Rightarrow\; \dot a = Ha \;\Rightarrow\; a(t) \propto e^{Ht}. \tag{165} $$
This corresponds to a so-called de Sitter Universe, characterized by the metric
$$ ds^2 = -dt^2 + e^{2Ht}\left[d\chi^2 + \chi^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right]. \tag{166} $$
We are heading toward de Sitter Universe, because the density of dark energy remains constant,
while the matter density scales as a−3 and radiation density as a−4 , which makes the dark energy
an ever-increasing part of the cosmic inventory.
The exponential expansion of the scale factor (see eq. (165)) means that the physical distance
between any two observers will eventually be growing faster than the speed of light. At that point
those two observers will, of course, not be able to have any contact anymore. Eventually, we will
not be able to observe any galaxies other than the Milky Way and a handful of others in the
gravitationally-bound Local Group cluster of galaxies.
If we consider that the expansion occurred about the time that the strong force "froze out" (at $t = t_{GUT}$), then
$$ H \approx \frac{1}{t_{GUT}} \approx \frac{1}{10^{-36}\,\text{s}} = 10^{36}\,\text{s}^{-1}, \tag{167} $$
which corresponds to an extremely short e-folding time, indicating a staggering rate of inflation. In just a few e-folding times, the Universe is already huge.
From eq. (150), we have
$$ (1 - \Omega_T) = -\frac{k}{a^2 H^2}, \tag{168} $$
which means that ΩT → 1 very fast, regardless of the value of k (recall, we noted earlier that the
curvature is relatively unimportant early in the history of the Universe — the behavior of flat,
closed and open Universes are asymptotically identical as t → 0). It also means that after inflation
ΩT = 1 — the Universe is flat.
We are heading toward a de Sitter Universe, because the density of dark energy remains constant, while the energy density of matter drops off as $a^{-3}$ (see Fig. 8).
Inflation solves the flatness problem: WMAP showed that the Universe is flat (or at least very nearly flat), i.e., $\Omega_T \approx 1$. Why is this so? Why 1? Why not, say, $10^{-5}$ or $10^{6}$? The standard model does not provide a reasonable explanation for the flat Universe. The problem is exacerbated because $\Omega_T = 1$, and thus the flat Universe, is an unstable fixed point. This means that if the Universe started with $\Omega_T = 1$ exactly, it would remain so forever. If, however, the Universe was created with any other value of $\Omega_T$, even one arbitrarily close, the separation between the value of $\Omega_T$ and 1 would grow over time, presuming only that the scale factor a grows slower than linearly in time. Let us demonstrate this mathematically.
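The first Friedmann equation (eq. (101a))
$$ \dot a^2 + k = \frac{8\pi G}{3}\rho a^2, \tag{169} $$
can be rewritten to yield
$$ \rho = \frac{3}{8\pi G a^2}\left(\dot a^2 + k\right). \tag{170} $$
Dividing by the critical density
$$ \rho_{cr} = \frac{3H^2}{8\pi G}, \tag{171} $$
yields
$$ \Omega_T = \frac{\rho}{\rho_{cr}} \quad \Longrightarrow \quad \Omega_T - 1 = \frac{\rho - \rho_{cr}}{\rho_{cr}} = \frac{3k}{8\pi G a^2}\,\frac{8\pi G a^2}{3\dot a^2} = \frac{k}{\dot a^2}. \tag{172} $$
It is easily seen that if $\dot a \to \infty$ for $t \to 0$, then $\Omega_T - 1 \to 0$.
If $a = a_0\left(\dfrac{t}{t_0}\right)^p$, then
$$ \dot a = a_0\,t_0^{-p}\,p\,t^{p-1}, \tag{173} $$
so that
$$ \frac{k}{\dot a^2} = \frac{k\,t_0^{2p}}{a_0^2\,p^2}\,t^{2(1-p)} \equiv \tilde k\,t^{2(1-p)}. \tag{174} $$
Finally, we obtain
$$ \Omega_T - 1 = \tilde k\,t^{2(1-p)}, \tag{175} $$
so that
$$ \Omega_T - 1 \to 0 \ \text{ as } t \to 0 \ \text{ for } p < 1, \qquad \Omega_T - 1 \to \infty \ \text{ as } t \to \infty \ \text{ for } p < 1. \tag{176} $$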
This means that the magnitude of ΩT − 1 grows with increasing t. In other words, during the entire
history of the Universe over which the scale factor a scales sub-linearly, the Universe is growing
increasingly non-flat (unless ΩT is exactly equal to unity). In the language of mathematics, ΩT = 1
is an unstable fixed point for p < 1.
Equation (175) holds a clue as to how to naturally obtain a flat Universe, in accordance with observations: change the dynamics so that $\Omega_T = 1$ is a stable fixed point. All that is required is that the scale factor grows super-linearly (for example p > 1 in the equations above). If one allows for a cosmological constant, so that a grows exponentially in time with $a(t) = \exp[Ht]$ (eq. (165)), then
$$ \dot a = H e^{Ht}, \tag{177} $$
so that
$$ \Omega_T - 1 = \frac{\rho - \rho_{cr}}{\rho_{cr}} = \frac{3k}{8\pi G a^2}\,\frac{8\pi G a^2}{3\dot a^2} = \frac{k}{\dot a^2} = \frac{k}{H^2}\,e^{-2Ht}. \tag{178} $$
It follows that any initial deviation from unity is squashed exponentially. If, at some early time in its history, the Universe underwent a period of exponential expansion (inflation), $\Omega_T$ would have been driven so close to unity that even the prolonged subsequent evolution with $a \propto t^p$, $p < 1$, would not drive it appreciably away from unity. Therefore, inflation solves the flatness problem.
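The two regimes can be compared with a few lines of code. The sketch below simply evaluates eq. (175) and eq. (178) relative to an initial deviation; the function names, the initial deviation of 10^-3, and the choice of 60 e-folds (i.e. H·Δt = 60) are illustrative assumptions, not values taken from these notes.

```python
import math


def deviation_power_law(initial_deviation, t_initial, t_final, p):
    """|Omega_T - 1| after power-law expansion a ~ t^p, from eq. (175)."""
    return initial_deviation * (t_final / t_initial) ** (2.0 * (1.0 - p))


def deviation_after_inflation(initial_deviation, e_folds):
    """|Omega_T - 1| after exponential expansion by e_folds = H * dt, from eq. (178)."""
    return initial_deviation * math.exp(-2.0 * e_folds)


if __name__ == "__main__":
    start = 1e-3
    # Matter-dominated growth (p = 2/3) over a factor of 100 in time.
    print("matter era:", deviation_power_law(start, 1.0, 100.0, 2.0 / 3.0))
    # Radiation-dominated growth (p = 1/2) over the same factor in time.
    print("radiation era:", deviation_power_law(start, 1.0, 100.0, 0.5))
    # Exponential suppression during an assumed 60 e-folds of inflation.
    print("after inflation:", deviation_after_inflation(start, 60.0))
```

For p < 1 the deviation grows as a power of time, whereas each e-fold of inflation suppresses it by a factor e^-2, which is why a modest number of e-folds outweighs a very long period of sub-linear expansion.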
Distance to an Emitter
It is often useful to determine the distance between a distant emitter and us. In comoving coordinates, the distance to an object at a scale factor $a$ (or alternatively redshift $z = 1/a - 1$) is
$$ \chi \equiv \int_{t(a)}^{t_0}\frac{dt'}{a(t')} = \int_a^1\frac{da'}{a'^2 H(a')}, \tag{179} $$
after the change of variables $da/dt = aH$. For the portion of the Universe which we can observe, which is to about z ≤ 6, the radiation which dominated early on can be ignored. For the purely matter-dominated flat Universe, we can combine the definition of the Hubble rate $H \equiv \dot a/a$ and eq. (115) to obtain
$$ H = \frac{\dot a}{a} = \frac{\frac{2}{3}\left(\frac{3H_0}{2}\right)^{2/3} t^{-1/3}}{\left(\frac{3H_0}{2}\right)^{2/3} t^{2/3}} = \frac{2}{3t} = \frac{2}{3}\cdot\frac{3H_0}{2\,a^{3/2}} = H_0\,a^{-3/2}. \tag{180} $$
This simplifies the integral in eq. (179) to
$$ \chi^{f,MD}(a) = \frac{1}{H_0}\int_a^1\frac{da'}{a'^{1/2}} = \frac{2}{H_0}\,a'^{1/2}\,\Big|_a^1 = \frac{2}{H_0}\left[1 - a^{1/2}\right], \tag{181} $$
where superscripts f and MD denote flat and matter-dominated Universe. In terms of the redshift z, eq. (181) becomes (after recalling $z = 1/a - 1$):
$$ \chi^{f,MD}(z) = \frac{2}{H_0}\left[1 - \frac{1}{\sqrt{1 + z}}\right]. \tag{182} $$
For small redshift z, $1/\sqrt{1 + z} \approx 1 - z/2$, so $\chi^{f,MD}(z) \approx z/H_0$. For large redshift z, $\chi^{f,MD}(z) \to 2/H_0$.
Angular Diameter Distance
Another important distance in astronomy is the angular diameter distance. It is determined by measuring the angle θ subtended by an object of known physical size l. Assuming that the angle is small, it is given by
$$ d_A = \frac{l}{\theta}. \tag{183} $$
To compute the angular diameter distance in an expanding Universe, we express the quantities l and θ in comoving coordinates. The comoving size of an object of physical size l is simply l/a, while the angle subtended in the flat Friedmann Universe is
$$ \theta = \frac{l/a}{\chi(a)}, \tag{184} $$
so finally we have
$$ d_A^{f,MD} = a\chi = \frac{\chi}{1 + z}. \tag{185} $$
For small redshift z, $d_A^{f,MD} \approx \chi$. At large z, $d_A^{f,MD} \to \chi/z \to 2/(zH_0)$, so the angular diameter distance decreases with redshift z. This means that in the flat Universe, objects at large redshifts appear larger than they would at intermediate redshifts!
Luminosity Distance
In astronomy, distances can be inferred by measuring the flux from an object of known luminosity ("standard candles"). Flux and luminosity are related through
$$ F \equiv \frac{L}{4\pi d^2}, \tag{186} $$
since the total luminosity through a spherical shell with area $4\pi d^2$ is constant. The total luminosity is defined as the amount of energy radiated per unit time, $L \equiv \dfrac{dE}{dt}$. Assume, without loss of generality, that all the N photons radiated have the same frequency ν (wavelength λ). Then the luminosity becomes $L = \dfrac{\hbar}{\lambda}\dfrac{dN}{dt}$. In comoving coordinates $\lambda_c = \lambda/a$ and the t-derivative is replaced by the η-derivative (recall $dt = a\,d\eta$), so
$$ L(\chi) = \frac{\hbar}{\lambda_c}\frac{dN}{d\eta} = a\,\frac{\hbar}{\lambda}\frac{dN}{dt}\,a = \frac{\hbar}{\lambda}\frac{dN}{dt}\,a^2 = La^2. \tag{187} $$
Then the observed flux is
$$ F = \frac{La^2}{4\pi\chi^2} = \frac{L}{4\pi\left(\frac{\chi}{a}\right)^2} \equiv \frac{L}{4\pi d_L^2}, \tag{188} $$
where
$$ d_L \equiv \frac{\chi}{a} \tag{189} $$
is the luminosity distance.
All three distances discussed today (comoving, angular diameter and luminosity) are larger in a Universe with a cosmological constant than in one without. You will convince yourself (and me, I hope) of this in one of the problems from your Homework set #1.
Important note: reliable measurements of these distances, when combined with accurate measurements of the redshift z can provide a constraint on the energy density of the dark energy Ωde0
(as will be discussed later in more detail).
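The three distances can be evaluated numerically for a flat Universe with matter and dark energy. The sketch below assumes H(a) from eq. (154) and the integral in eq. (179), with distances in units of 1/H0; the function names and the midpoint-rule integrator are illustrative choices.

```python
import math


def hubble_ratio(a, omega_de):
    """H(a)/H0 for a flat Universe with matter and dark energy, eq. (154)."""
    return math.sqrt((1.0 - omega_de) / a**3 + omega_de)


def comoving_distance(z, omega_de=0.0, steps=20_000):
    """chi(z) in units of 1/H0: midpoint rule applied to eq. (179)."""
    a_emit = 1.0 / (1.0 + z)
    da = (1.0 - a_emit) / steps
    total = 0.0
    for i in range(steps):
        a = a_emit + (i + 0.5) * da
        total += da / (a * a * hubble_ratio(a, omega_de))
    return total


def angular_diameter_distance(z, omega_de=0.0):
    """d_A = chi / (1 + z), eq. (185)."""
    return comoving_distance(z, omega_de) / (1.0 + z)


def luminosity_distance(z, omega_de=0.0):
    """d_L = chi / a = chi * (1 + z), eq. (189)."""
    return comoving_distance(z, omega_de) * (1.0 + z)


if __name__ == "__main__":
    for z in (0.1, 1.0, 10.0):
        for omega_de in (0.0, 0.7):
            print(z, omega_de,
                  comoving_distance(z, omega_de),
                  angular_diameter_distance(z, omega_de),
                  luminosity_distance(z, omega_de))
```

For Ωde0 = 0 the numerical χ can be compared with the closed form of eq. (182), and setting Ωde0 = 0.7 shows all three distances increasing relative to the matter-only case at the same redshift.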
[Figure 10 plot: distance in units of 1/H0 versus redshift z, both on logarithmic axes from 0.1 to 10, showing dL, χ and dA for Ωde0 = 0.7 and Ωde0 = 0.]
Figure 10: Three distance measures in a flat expanding matter-dominated Universe (thin lines) and in a Universe with matter and dark energy corresponding to Ωde0 = 0.7 (thick lines). Solid lines correspond to the comoving distance χ, dotted lines to the angular diameter distance dA, and dashed lines to the luminosity distance dL.
Lecture 8: Summary of Foundations of Cosmology
“Shall I refuse my dinner because I do not fully understand the process of digestion?”
Oliver Heaviside
The Big Picture: In the past seven lectures, we introduced and reviewed the basic ideas of GR as
they pertain to the understanding of the Universe on the largest scales. We derived the equations
of GR which describe the dynamics in a curved spacetime — geodesic equation and the Einstein’s
equations. Solving Einstein’s equations, both with and without the cosmological constant, leads to
different cosmologies, which depend on both curvature — flat, closed and open — and content of
the Universe — matter, radiation and dark (vacuum) energy. Today we review these concepts.
General Relativity: Dynamics in Curved Spacetime
GR describes the dynamics in curved spacetime through two equations:
• Geodesic equation: how a particle moves in curved spacetime (GR analogy to Newton's Second Law in flat Euclidean space).
$$ \ddot x^\nu = -\Gamma^\nu_{\gamma\delta}\,\dot x^\gamma\dot x^\delta. \tag{190} $$
• Einstein's equations: how mass and energy distort (curve) spacetime (GR analogy to Poisson equation which describes how mass distribution creates a force field in Newtonian mechanics).
$$ R^\alpha_\beta - \frac{1}{2}\delta^\alpha_\beta R + \delta^\alpha_\beta\Lambda = 8\pi G\,T^\alpha_\beta, \tag{191} $$
where Λ is a cosmological constant corresponding to "vacuum" energy (dark energy).
Solving Einstein's equations in the FLRW metric
$$ ds^2 = -dt^2 + a^2\left[d\chi^2 + \Sigma^2(\chi)\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right] \tag{192} $$
with (possibly) evolving space (through the scale factor a(t), which is not assumed a priori to be time-dependent), leads to Friedmann's equations:
$$ \dot a^2 + k = \frac{8\pi G}{3}\rho a^2, \tag{193a} $$
$$ \dot\rho + 3(\rho + P)\frac{\dot a}{a} = 0, \tag{193b} $$
$$ P = P(\rho). \tag{193c} $$
We have looked at three different equations of state P = P(ρ):
• Dust approximation for matter-dominated Universe: in comoving coordinates, the matter is approximated as stationary dust particles which produce no pressure, $P = 0$.
• Perfect fluid approximation for radiation-dominated Universe: the pressure induced by the movement of relativistic particles is $P = \frac{1}{3}\rho$.
• Vacuum (dark) energy for dark energy-dominated Universe: $P = -\rho$.
More generally, we expressed these equations of state through a w parameter, defined as
$$ w \equiv \frac{P}{\rho}. \tag{194} $$
Table 3: Parameter w for the equations of state in different regimes.

| regime | w | scaling of ρ with a(t) |
|---|---|---|
| radiation-dominated | 1/3 | $\propto a^{-4}$ |
| matter-dominated | 0 | $\propto a^{-3}$ |
| dark energy-dominated | −1 | $\propto 1$ |
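The scalings in Table 3 all follow from eq. (193b) with P = wρ, which can be written as dρ/da = −3(1 + w)ρ/a and integrates to ρ ∝ a^(−3(1+w)). The sketch below checks this by integrating the equation numerically; the fourth-order Runge-Kutta integrator and the function names are illustrative choices, and ρ is normalized to 1 at a = 1.

```python
def density_power_law(a, w):
    """Analytic solution rho(a) = a^(-3(1+w)) with rho(1) = 1."""
    return a ** (-3.0 * (1.0 + w))


def density_integrated(a_end, w, steps=1000):
    """Integrate d(rho)/da = -3 (1 + w) rho / a from a = 1 to a_end with RK4."""
    def slope(a, rho):
        return -3.0 * (1.0 + w) * rho / a

    a, rho = 1.0, 1.0
    h = (a_end - 1.0) / steps
    for _ in range(steps):
        k1 = slope(a, rho)
        k2 = slope(a + 0.5 * h, rho + 0.5 * h * k1)
        k3 = slope(a + 0.5 * h, rho + 0.5 * h * k2)
        k4 = slope(a + h, rho + h * k3)
        rho += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        a += h
    return rho


if __name__ == "__main__":
    for name, w in (("radiation", 1.0 / 3.0), ("matter", 0.0), ("dark energy", -1.0)):
        print(name, density_integrated(0.1, w), density_power_law(0.1, w))
```

At a = 0.1 the radiation density is larger than today by a factor 10^4, the matter density by 10^3, and the dark energy density is unchanged, matching the three rows of the table.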
Cosmology: Solutions to Friedmann’s Equations
To specify a cosmology, we use Friedmann’s equations and choose:
1. Curvature of the Universe:
• flat: k = 0,
• closed: k = +1,
• open: k = −1;
2. Equation of state (dominating regime given in Table 3).
Expanding Universe
Solving Friedmann's equations yields a number of different cosmologies, which we derived
and discussed in class. Some of these predict an age of the Universe which is grossly wrong, leading us
to believe that the underlying assumptions were incorrect. Observations show that the Universe
is very nearly flat, so we focus on the flat k = 0 cosmology. Solving for the scale factor a(t) in
the flat Universe, without any additional a priori assumptions, we obtain that the Universe
is expanding, and that its expansion is decelerating during the radiation- and matter-dominated
epochs, and accelerating during the dark energy-dominated epoch (see Table 4). Observations also
show us the current relative content of the Universe, i.e., how much of the critical density is
found in radiation (about 0.005%), baryonic matter (about 4%), dark matter (about 24%) and dark
energy (about 72%). Using how these different constituents scale with the scale factor a(t) (see
Table 3), we can compute when each of the constituents dominated (Fig. 12).
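The transition points between the epochs can be estimated directly from the quoted present-day fractions and the scalings in Table 3. The following is a minimal sketch, assuming the rounded percentages above; the names OMEGA_R0, OMEGA_M0, OMEGA_L0, omega and dominant are introduced here for illustration only.

```python
# Present-day fractions of the critical density, as quoted in the text (rounded values).
OMEGA_R0 = 5e-5          # radiation, about 0.005%
OMEGA_M0 = 0.04 + 0.24   # baryonic matter + dark matter
OMEGA_L0 = 0.72          # dark energy

def omega(a):
    """Densities in units of today's critical density at scale factor a (Table 3 scalings)."""
    return {
        "radiation": OMEGA_R0 * a**-4,
        "matter": OMEGA_M0 * a**-3,
        "dark energy": OMEGA_L0,
    }

def dominant(a):
    """Name of the constituent with the largest density at scale factor a."""
    d = omega(a)
    return max(d, key=d.get)

# Radiation-matter equality: OMEGA_R0 a^-4 = OMEGA_M0 a^-3
a_eq = OMEGA_R0 / OMEGA_M0
# Matter-dark energy equality: OMEGA_M0 a^-3 = OMEGA_L0
a_eq2 = (OMEGA_M0 / OMEGA_L0) ** (1.0 / 3.0)

print(f"a_eq  = {a_eq:.2e}")
print(f"a_eq2 = {a_eq2:.2f}")
for a in (1e-6, 1e-2, 1.0):
    print(f"a = {a:g}: {dominant(a)}-dominated")
```

With these rounded inputs the two equalities fall at a of order 10^-4 and somewhat below 1, consistent with the ordering of the three epochs.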
Table 4: Scale factor a(t) for different regimes in the flat Universe.
regime                 | a(t)       | ȧ(t)                        | ä(t)
radiation-dominated    | ∝ t^{1/2}  | ∝ t^{-1/2} > 0 (expanding)  | ∝ −t^{-3/2} < 0 (decelerating)
matter-dominated       | ∝ t^{2/3}  | ∝ t^{-1/3} > 0 (expanding)  | ∝ −t^{-4/3} < 0 (decelerating)
dark energy-dominated  | ∝ e^{Ht}   | ∝ e^{Ht} > 0 (expanding)    | ∝ e^{Ht} > 0 (accelerating)
[Plot: a(t) versus t [Aeons] on logarithmic axes; regimes labeled radiation-dominated, matter-dominated and Λ-dom., separated at aeq and aeq2.]
Figure 11: The scale radius a(t) plotted against time t for a flat Universe. Note three different epochs
(regimes) in the history of the Universe: (1) radiation-dominated a < aeq, (2) matter-dominated aeq < a <
aeq2, (3) dark energy-dominated a > aeq2. The growth of a(t) during the first
two epochs is sub-linear (the linear regime is shown in dashed lines), so the rate of expansion of the Universe is
decreasing (decelerating expansion). The growth of a(t) during the third epoch is
exponential (and hence super-linear), which means that the rate of expansion of the Universe is increasing
(accelerating expansion).
[Plot: log10[ρ(t)/ρcr] versus a(t) for radiation, matter and dark energy (Λ), with aeq, aeq2 and today marked.]
Figure 12: Three epochs in the evolution of the Universe: (1) radiation-dominated a < aeq, (2) matter-dominated aeq < a < aeq2, (3) dark energy-dominated a > aeq2. For a preview of the processes
occurring in each of these epochs, see Fig. 1.15 in the textbook.
Lecture 9: Cosmic Inventory I: Radiation
“Happy is he who gets to know the reasons for things.”
Virgil (70 – 19 BC; Roman poet)
The Big Picture: Last time we talked about inflation early in the Universe's history as the currently prevailing explanation for the horizon problem and the observed flatness of the Universe. Today
we are going to talk about the radiation content of the Universe: photons and neutrinos, and
their relative abundances. Next time, we'll complete this with the matter content: baryonic and dark
matter. Later yet, we will talk about dark energy.
Distribution Function of Species
The distribution function of different species is given by the Bose-Einstein distribution for bosons
(particles with an integer spin, such as photons, W and Z bosons, gluons, gravitons, mesons, etc.):
$$f_{BE} = \frac{1}{e^{(E-\mu)/T} - 1}, \qquad (195)$$
and the Fermi-Dirac distribution for fermions (particles with a half-integer spin, such as quarks,
baryons, leptons, etc.):
$$f_{FD} = \frac{1}{e^{(E-\mu)/T} + 1}, \qquad (196)$$
where $E(p) = \sqrt{p^2 + m^2}$ and $\mu$ is the chemical potential, which is much smaller than the temperature $T$ for almost all particles at almost all times, and can therefore be safely ignored in most
of the calculations. These distributions are for the smooth Universe, and represent a zero-order
approximation. They therefore do not depend on the position $\vec{x}$ or on the direction of the momentum
$\vec{p}$, but only on the magnitude of the momentum $p$.
The properties of a species specified by the distribution function $f(\vec{x}, \vec{p})$ are computed by integrating quantities over the distribution function. For example, the energy density $\rho_i$ of a species $i$
is given by
$$\rho_i = g_i \int \frac{d^3p}{(2\pi)^3} f_i(\vec{x}, \vec{p}) E(p), \qquad (197)$$
where $g_i$ is the degeneracy of the species (for instance, $g_i = 2$ for the photon, for its two spin states).
The factor $1/(2\pi\hbar)^3$ (written above in units with $\hbar = 1$) is the consequence of Heisenberg's uncertainty principle, which states that no
particle can be localized in a phase-space volume smaller than $(2\pi\hbar)^3$, so this becomes the unit size
of the phase-space.
Similarly, the pressure of a species $i$ can be expressed as
$$P_i = g_i \int \frac{d^3p}{(2\pi)^3} f_i(\vec{x}, \vec{p}) \frac{p^2}{3E(p)}. \qquad (198)$$
Entropy Density
Entropy density is defined as (when the chemical potential is negligible, as it is in almost all
cases in cosmology):
$$s \equiv \frac{\rho + P}{T}. \qquad (199)$$
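To compute how the entropy density scales with the scale factor $a$, rewrite the second Friedmann
equation (eq. (101b)):
$$\dot{\rho} + 3\frac{\dot{a}}{a}(\rho + P) = 0$$
$$a^{-3}\frac{\partial(\rho a^3)}{\partial t} + 3\frac{\dot{a}}{a}P = 0$$
$$a^{-3}\frac{\partial[(\rho + P)a^3]}{\partial t} - \frac{\partial P}{\partial t} = 0.$$
Combining the equation above with the result (Homework set #1)
$$\frac{\partial P}{\partial T} = \frac{\rho + P}{T}, \qquad (200)$$
and the fact that, due to the chain rule,
$$\frac{\partial P}{\partial t} = \frac{\partial P}{\partial T}\frac{\partial T}{\partial t}, \qquad (201)$$
we obtain
$$a^{-3}\frac{\partial[(\rho + P)a^3]}{\partial t} - \frac{\partial T}{\partial t}\frac{\rho + P}{T} = a^{-3}\,T\,\frac{\partial}{\partial t}\left[\frac{(\rho + P)a^3}{T}\right] = 0. \qquad (202)$$
The quantity in brackets is constant, so
$$\frac{(\rho + P)a^3}{T} = sa^3 = \mathrm{const.}, \qquad (203)$$
and entropy density scales as $a^{-3}$. This result holds for the total entropy density of a mixture of
species in equilibrium, even if two species have different temperatures. The importance of this
result will be obvious soon when we use it to compute the relative temperatures of neutrinos and
photons in the Universe.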
Photons
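The energy density due to CMB radiation can be found by using eq. (197) with the Bose-Einstein
distribution given in eq. (195):
$$\rho_\gamma = g_\gamma \int \frac{d^3p}{(2\pi)^3}\frac{E(p)}{e^{E/T_\gamma} - 1} = 2\int \frac{d^3p}{(2\pi)^3}\frac{p}{e^{p/T_\gamma} - 1}, \qquad (204)$$
where we have used $g_\gamma = 2$, $E(p) = \sqrt{p^2 + m^2} = p$ for massless photons, and neglected the chemical
potential $\mu$. After noting that $d^3p = 4\pi p^2 dp$, and making the substitution $x = p/T_\gamma$,
$$\rho_\gamma = \frac{8\pi}{(2\pi)^3}\int_0^\infty \frac{p^3}{e^{p/T_\gamma} - 1}\,dp = \frac{8\pi}{(2\pi)^3}T_\gamma^4\int_0^\infty \frac{x^3}{e^x - 1}\,dx = \frac{8\pi}{(2\pi)^3}T_\gamma^4\,6\zeta(4) = \frac{8\pi}{(2\pi)^3}T_\gamma^4\,\frac{\pi^4}{15}$$
$$\Rightarrow \quad \rho_\gamma = \frac{\pi^2}{15}T_\gamma^4, \qquad (205)$$
where we have used the result
$$\int_0^\infty \frac{x^3}{e^x - 1}\,dx = 6\zeta(4) = \frac{\pi^4}{15}. \qquad (206)$$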
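We have derived earlier that the energy density of radiation scales as $\rho_\gamma \propto a^{-4}$ (see eq. (65)). Since,
from eq. (205), $\rho_\gamma \propto T_\gamma^4$, we see that $T_\gamma \propto a^{-1}$. This means
$$T_\gamma a = T_{\gamma 0} a_0 = T_{\gamma 0} \quad \Rightarrow \quad T_\gamma = \frac{T_{\gamma 0}}{a} = \frac{2.725\,\mathrm{K}}{a}, \qquad (207)$$
where $T_{\gamma 0} = 2.725$ K is the temperature of the CMB measured today (we also used $a_0 = 1$).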
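In terms of the critical density $\rho_{cr}$, we have
$$\Omega_\gamma \equiv \frac{\rho_\gamma}{\rho_{cr}} = \frac{\pi^2}{15}T_\gamma^4\frac{1}{\rho_{cr}} = \frac{\pi^2}{15}\left(\frac{2.725\,\mathrm{K}}{a}\right)^4\frac{1}{8.098\times10^{-11}\,h^2\,\mathrm{eV}^4}, \qquad (208)$$
where the value for $\rho_{cr}$ is found from Appendix B, page 416 in the textbook. We now use the
relationship between Kelvin and eV: 11605 K = 1 eV, so the above equation becomes
$$\Omega_\gamma = \frac{\rho_\gamma}{\rho_{cr}} = \frac{\pi^2}{15}\left(\frac{2.725\,\mathrm{K}}{a}\right)^4\frac{1}{8.098\times10^{-11}\,h^2\,(11605\,\mathrm{K})^4} = \frac{2.47\times10^{-5}}{h^2a^4}. \qquad (209)$$
If we take $h \approx 0.72$, then the fractional content of the Universe due to CMB radiation today is
$$\Omega_\gamma|_{today} = \Omega_{\gamma 0} \equiv \frac{\rho_{\gamma 0}}{\rho_{cr0}} = 4.76\times10^{-5}. \qquad (210)$$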
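The arithmetic behind eqs. (208)-(210) is easy to reproduce. The sketch below uses only the numbers quoted above (the CMB temperature, the Kelvin-to-eV conversion and the critical density); the function name omega_gamma and the constant names are introduced here for illustration.

```python
import math

T_CMB_K = 2.725              # CMB temperature today [K], from the text
K_PER_EV = 11605.0           # 11605 K = 1 eV, from the text
RHO_CR_OVER_H2 = 8.098e-11   # critical density divided by h^2 [eV^4], from the text

def omega_gamma(a=1.0, h=0.72):
    """Photon density in units of the critical density, following eqs. (205) and (208)."""
    t_ev = T_CMB_K / K_PER_EV / a               # T_gamma = T_gamma0 / a, converted to eV
    rho_gamma = math.pi**2 / 15.0 * t_ev**4     # eq. (205), in eV^4
    return rho_gamma / (RHO_CR_OVER_H2 * h**2)

print(f"Omega_gamma h^2 today     = {omega_gamma(h=1.0):.3e}")
print(f"Omega_gamma today, h=0.72 = {omega_gamma():.3e}")
```

Setting h = 1 isolates the coefficient of eq. (209), and the default h = 0.72 gives the value of eq. (210).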
Neutrinos
Cosmic neutrinos have not been directly observed, because they are weakly interacting particles.
Neutrinos are leptons, and hence fermions, so they are subject to Fermi-Dirac distribution.
In order to compute the relative energy density of neutrinos, we need to relate the temperature
of neutrinos to the temperature of photons in CMB radiation.
Neutrinos were once in equilibrium with the rest of the cosmic plasma. They decoupled from
the hot plasma slightly before the annihilation of electrons and positrons, which took place when the cosmic temperature
dropped to roughly the electron mass. We therefore invoke an argument based on entropy density,
which we have shown to scale as $a^{-3}$ (eq. (203)).
Before the annihilation (and before the decoupling of neutrinos), the plasma has a uniform
temperature of, say, $T_1$ (also let $a = a_1$). The pressure due to CMB radiation (photons) is given by
$$P_\gamma = \frac{1}{3}\rho_\gamma, \qquad (211)$$
so the contribution to the entropy density for each spin state is (recall that eq. (205) has a factor $g_\gamma = 2$ reflecting
2 spin states)
$$s_\gamma = \frac{\rho_\gamma + P_\gamma}{T_1} = \frac{4}{3T_1}\rho_\gamma = \frac{4}{3T_1}\,\frac{1}{2}\,\frac{\pi^2}{15}T_1^4 = \frac{2\pi^2}{45}T_1^3. \qquad (212)$$
Photons are bosons, and hence subject to Bose-Einstein statistics, which, as we saw in eq. (206),
leads to the integral
$$I_{BE} \equiv \int_0^\infty \frac{x^3}{e^x - 1}\,dx = 6\zeta(4) = \frac{\pi^4}{15}. \qquad (213)$$
Computation of the energy density for fermions involves integration over the Fermi-Dirac
distribution function, which leads to the integral
$$I_{FD} \equiv \int_0^\infty \frac{x^3}{e^x + 1}\,dx = \frac{7}{8}\,6\zeta(4) = \frac{7\pi^4}{120} = \frac{7}{8}I_{BE}. \qquad (214)$$
Therefore, the contribution of massless fermions will be 7/8 of the contribution of massless bosons.
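The factor 7/8 can be checked numerically without knowing the closed forms. The following is a minimal sketch using a composite Simpson rule; the helper names simpson and integrand and the cutoff X_MAX are choices made here, not part of the lecture.

```python
import math

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def integrand(sign):
    """x^3 / (e^x + sign): sign = -1 for Bose-Einstein, sign = +1 for Fermi-Dirac."""
    def f(x):
        if x == 0.0:
            return 0.0   # the integrand tends to 0 as x -> 0 in both cases
        return x**3 / (math.exp(x) + sign)
    return f

X_MAX = 60.0   # the integrands are exponentially small beyond this point
I_BE = simpson(integrand(-1), 0.0, X_MAX)
I_FD = simpson(integrand(+1), 0.0, X_MAX)

print(f"I_BE = {I_BE:.8f}   pi^4/15    = {math.pi**4 / 15:.8f}")
print(f"I_FD = {I_FD:.8f}   7 pi^4/120 = {7 * math.pi**4 / 120:.8f}")
print(f"I_FD / I_BE = {I_FD / I_BE:.8f}")
```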
Before the annihilation, there are the following fermions: electrons (2 spin states), positrons (2
spin states), neutrinos (3 generations and 1 spin state) and anti-neutrinos (3 generations and 1 spin
state), and the following bosons: photons (2 spin states). Therefore, before the annihilation, the
entropy density is given by the sum of the entropy densities of all species:
$$s(a_1) = \frac{2\pi^2}{45}T_1^3\left[2 + \frac{7}{8}(2 + 2 + 3 + 3)\right] = \frac{43\pi^2}{90}T_1^3. \qquad (215)$$
After annihilation, the temperatures of photons and neutrinos are no longer equal. Neutrinos
decoupled slightly before the annihilation, after which their temperature $T_\nu$ scales as $a^{-1}$ (just
like for photons). Photons were still coupled to the plasma during the annihilation: the electrons and
positrons annihilated into high-energy photons, which quickly reached equilibrium with the other photons,
effectively raising their equilibrium temperature $T_\gamma$. The entropy density after the annihilation (at some $a = a_2$) is therefore
$$s(a_2) = \frac{2\pi^2}{45}\left[2T_\gamma^3 + \frac{7}{8}\,6\,T_\nu^3\right] = \frac{4\pi^2}{45}\left[T_\gamma^3 + \frac{21}{8}T_\nu^3\right]. \qquad (216)$$
But entropy density $s$ scales as $a^{-3}$, so
$$sa^3 = s(a_1)a_1^3 = s(a_2)a_2^3, \qquad (217)$$
which leads to
$$\frac{43\pi^2}{90}T_1^3a_1^3 = \frac{4\pi^2}{45}\left[T_\gamma^3 + \frac{21}{8}T_\nu^3\right]a_2^3. \qquad (218)$$
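Neutrino temperature scales throughout as $a^{-1}$:
$$Ta = T_1a_1 = T_\nu a_2, \qquad (219)$$
so
$$\frac{43\pi^2}{90}T_1^3a_1^3 = \frac{43\pi^2}{90}(T_1a_1)^3 = \frac{43\pi^2}{90}(T_\nu a_2)^3 = \frac{4\pi^2}{45}\left[\left(\frac{T_\gamma}{T_\nu}\right)^3 + \frac{21}{8}\right](T_\nu a_2)^3,$$
$$\Rightarrow \quad \frac{43}{8} = \left(\frac{T_\gamma}{T_\nu}\right)^3 + \frac{21}{8} \quad \Rightarrow \quad \left(\frac{T_\gamma}{T_\nu}\right)^3 = \frac{22}{8}$$
$$\Rightarrow \quad \frac{T_\gamma}{T_\nu} = \left(\frac{11}{4}\right)^{1/3} \approx 1.4, \quad \mathrm{or} \quad \frac{T_\nu}{T_\gamma} = \left(\frac{4}{11}\right)^{1/3} \approx 0.71. \qquad (220)$$
This means that the neutrino temperature is lower than the CMB radiation (photon) temperature by a
factor of $(4/11)^{1/3}$ (about 29% lower), because the photons were heated by the annihilation of electrons and
positrons.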
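The counting of degrees of freedom in eqs. (215)-(220) can be summarized in a few lines. This is a sketch of the same entropy argument, with fermions weighted by 7/8; the variable names are introduced here for illustration.

```python
# Effective entropy degrees of freedom, counting each fermionic state with weight 7/8.
FERMI = 7.0 / 8.0
g_before = 2 + FERMI * (2 + 2 + 3 + 3)   # photons + (electrons, positrons, neutrinos, anti-neutrinos)
g_photon = 2.0                           # photons after the annihilation
g_nu = FERMI * 6                         # neutrinos + anti-neutrinos after the annihilation

# Entropy conservation with T_1 a_1 = T_nu a_2:
#   g_before = g_photon * (T_gamma / T_nu)^3 + g_nu
ratio_cubed = (g_before - g_nu) / g_photon      # (T_gamma / T_nu)^3
t_ratio = ratio_cubed ** (-1.0 / 3.0)           # T_nu / T_gamma

T_CMB_K = 2.725                                 # photon temperature today [K]
print(f"(T_gamma/T_nu)^3 = {ratio_cubed}")
print(f"T_nu/T_gamma     = {t_ratio:.4f}")
print(f"T_nu today       = {t_ratio * T_CMB_K:.3f} K")
```

The last line applies the ratio to the measured photon temperature; since both temperatures scale as 1/a after the annihilation, the ratio is the same today.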
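Now that we can relate the temperature of neutrinos $T_\nu$ to the temperature of photons $T_\gamma$
(which we measure today to be 2.725 K), we can compute the energy density of the neutrinos
(which are fermions, and hence subject to the Fermi-Dirac distribution function):
$$\rho_\nu = g_\nu \int \frac{d^3p}{(2\pi)^3}\frac{E(p)}{e^{E/T_\nu} + 1} = 6\int \frac{d^3p}{(2\pi)^3}\frac{p}{e^{p/T_\nu} + 1}, \qquad (221)$$
where $g_\nu = 6$ (three flavors and their antiparticles: $\nu_e$, $\nu_\mu$, $\nu_\tau$, $\bar{\nu}_e$, $\bar{\nu}_\mu$, $\bar{\nu}_\tau$), $E(p) = \sqrt{p^2 + m^2} = p$ for massless neutrinos, and
we have neglected the chemical potential $\mu$. After noting that $d^3p = 4\pi p^2 dp$, and making the substitution
$x = p/T_\nu$,
$$\rho_\nu = \frac{24\pi}{(2\pi)^3}\int_0^\infty \frac{p^3}{e^{p/T_\nu} + 1}\,dp = \frac{24\pi}{(2\pi)^3}T_\nu^4\int_0^\infty \frac{x^3}{e^x + 1}\,dx = \frac{24\pi}{(2\pi)^3}T_\nu^4\,I_{FD} = \frac{3\pi}{\pi^3}T_\nu^4\,\frac{7}{8}\,\frac{\pi^4}{15}$$
$$\Rightarrow \quad \rho_\nu = \frac{7\pi^2}{40}T_\nu^4 = \frac{7\pi^2}{40}\left(\frac{4}{11}\right)^{4/3}T_\gamma^4, \qquad (222)$$
or in terms of the energy density of photons
$$\rho_\nu = \frac{7}{40}\,15\left(\frac{4}{11}\right)^{4/3}\frac{\pi^2}{15}T_\gamma^4 \quad \Rightarrow \quad \rho_\nu = \frac{21}{8}\left(\frac{4}{11}\right)^{4/3}\rho_\gamma. \qquad (223)$$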
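We also have
$$\Omega_\nu = \frac{21}{8}\left(\frac{4}{11}\right)^{4/3}\Omega_\gamma = \frac{21}{8}\left(\frac{4}{11}\right)^{4/3}\frac{2.47\times10^{-5}}{h^2a^4} = \frac{1.68\times10^{-5}}{h^2a^4}, \qquad (224)$$
so that the ratio of the neutrino density to the critical density today is
$$\Omega_\nu|_{today} \equiv \Omega_{\nu 0} = \frac{1.68\times10^{-5}}{h^2}. \qquad (225)$$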
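The numerical coefficient relating the neutrino and photon densities follows directly from eq. (223). A short sketch, taking the photon coefficient $2.47\times10^{-5}$ from eq. (209) as given (the variable names are illustrative):

```python
OMEGA_GAMMA_H2 = 2.47e-5   # Omega_gamma h^2 today, from eq. (209)

# rho_nu / rho_gamma for three massless neutrino generations, eq. (223)
ratio = (21.0 / 8.0) * (4.0 / 11.0) ** (4.0 / 3.0)
omega_nu_h2 = ratio * OMEGA_GAMMA_H2

print(f"rho_nu / rho_gamma = {ratio:.4f}")
print(f"Omega_nu h^2 today = {omega_nu_h2:.3e}")
```

Massless neutrinos therefore add roughly two thirds of the photon energy density to the radiation budget.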
All of the calculations above were done assuming that the neutrinos are massless. However, observations of solar neutrinos indicate that they change flavors on their way from the Sun to us, which can
only happen if they have mass. The observations of atmospheric neutrinos suggest that at least one
neutrino has a mass larger than 0.05 eV. In that case, for a massive neutrino, $E(p) = \sqrt{p^2 + m_\nu^2} \neq p$,
so the integral in eq. (222) becomes (with $g_\nu = 2$ for one flavor of neutrinos with 2 spin states)
$$\rho_\nu = \frac{8\pi}{(2\pi)^3}\int_0^\infty \frac{p^2\sqrt{p^2 + m_\nu^2}}{e^{\sqrt{p^2 + m_\nu^2}/T_\nu} + 1}\,dp. \qquad (226)$$
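Unlike the massless case, eq. (226) has no simple closed form, but it is straightforward to evaluate numerically. The sketch below works in units of $T_\nu^4$ with the dimensionless variables $x = p/T_\nu$ and $y = m_\nu/T_\nu$ (both introduced here), and uses a composite Simpson rule; the sample values of y are arbitrary illustrations, not physical estimates.

```python
import math

def rho_over_T4(y, x_max=80.0, n=40000):
    """Eq. (226) in units of T_nu^4 for one flavor with g_nu = 2, where y = m_nu / T_nu."""
    def f(x):
        e = math.sqrt(x * x + y * y)          # E / T_nu
        return x * x * e / (math.exp(e) + 1.0)
    h = x_max / n
    total = f(0.0) + f(x_max)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(i * h)
    integral = total * h / 3.0                # composite Simpson rule
    return integral / math.pi**2              # prefactor 8 pi / (2 pi)^3 = 1 / pi^2

massless = rho_over_T4(0.0)
print(f"y = 0: {massless:.6f}   (7 pi^2 / 120 = {7 * math.pi**2 / 120:.6f})")
for y in (0.5, 1.0, 2.0):
    print(f"y = {y}: rho / rho(massless) = {rho_over_T4(y) / massless:.4f}")
```

For y = 0 the result reduces to one third of the coefficient in eq. (222), as expected for a single flavor with $g_\nu = 2$ instead of $g_\nu = 6$.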
Lecture 10: Cosmic Inventory II: Baryonic and Dark Matter
“The least deviation from the truth is multiplied later.”
Aristotle
The Big Picture: Last time we talked about the radiation contents of the Universe: photons and
neutrinos and their relative abundances. Today we are going to talk about the dark matter — its
historical background, evidence for it and its importance.
Baryonic Matter
When using the term “baryonic matter”, both baryons and electrons are implied. Electrons are
leptons, not baryons, but since the mass of an electron is nearly 2000 times smaller than
the mass of a proton or a neutron, the electron contribution to the mass is negligible.
Unlike the energy density of CMB radiation, which can be described as a gas with a temperature
and vanishing chemical potential, the baryonic density must be directly measured. The different
methods, which measure the baryonic density at varying redshifts z, largely agree that it is about 2−5%
of the critical density today:
$$\Omega_b|_{today} \equiv \Omega_{b0} \equiv \frac{\rho_{b0}}{\rho_{cr0}} = 0.02 - 0.05. \qquad (227)$$
We also know that the total amount of baryonic matter is constant, so with the expanding Universe,
the baryonic energy density scales as $\rho_b \propto a^{-3}$, so
$$\Omega_b = \frac{\rho_b}{\rho_{cr0}} = \frac{\rho_{b0}}{\rho_{cr0}}a^{-3} = \Omega_{b0}a^{-3}. \qquad (228)$$
Several methods are used to gauge the baryon content of the Universe:
1. Directly observing visible matter in galaxies. It has been found that the largest contribution
comes from the gas in galaxy clusters, while stars in galaxies account for only a comparatively
small fraction. This approach estimates Ωb0 = 0.02.
2. Looking at spectra of distant galaxies, and measuring the amount of light absorption. The
amount of light absorbed quantifies the amount of hydrogen the light encounters along the
way. Baryon density is then inferred from the estimate of the amount of hydrogen. This
approach roughly estimates $\Omega_{b0}h^{1.5} \approx 0.02$ (Rauch et al. 1997, Astrophysical Journal, 489, 7).
3. Computing the baryon content of the Universe from the anisotropies of the CMB radiation.
This approach puts fairly stringent limits on the baryon content, at about $\Omega_{b0}h^2 = 0.024^{+0.004}_{-0.003}$.
4. Inferring the baryon content of the Universe from the light element abundances. These pin
down the baryon content to $\Omega_{b0}h^2 = 0.0205 \pm 0.0018$.
These estimates are in fairly good agreement. They put a rough baryonic content of the Universe
at about 2 − 5% of the critical density. However, as we shall soon see, the total matter density in
the Universe is significantly higher than that, so there must be another form of matter other than
baryonic.
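To compare the estimates above on the same footing, the powers of h have to be divided out. A minimal sketch, assuming h = 0.72 (the value used for the photon density earlier) and central values only; the dictionary labels are shorthand introduced here.

```python
H = 0.72   # dimensionless Hubble parameter, assumed value

estimates = {
    "absorption in distant spectra (Omega_b0 h^1.5 = 0.02)": 0.02 / H**1.5,
    "CMB anisotropies (Omega_b0 h^2 = 0.024)": 0.024 / H**2,
    "light element abundances (Omega_b0 h^2 = 0.0205)": 0.0205 / H**2,
}

for method, omega_b0 in estimates.items():
    print(f"{method}: Omega_b0 = {omega_b0:.3f}")
```

For this choice of h, all three central values land inside the quoted 2−5% range.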
Dark Matter
The first evidence of what was later named dark matter was provided by the Swiss astrophysicist
Fritz Zwicky in 1933. He used the virial theorem to show that the observed (luminous) matter was
not nearly enough to keep the Coma cluster of galaxies together.
For nearly four decades the “missing mass problem” was ignored, until Vera Rubin in the
late 1960s and early 1970s measured velocity curves of edge-on spiral galaxies to a theretofore
unprecedented accuracy. To the great astonishment of the scientific community, she demonstrated
that most stars in spiral galaxies orbit the center at roughly the same speed, which suggested that
mass densities of the galaxies were uniform well beyond the location of most of the stars. This was
consistent with the spiral galaxies being embedded in a much larger halo of invisible mass (“dark
matter halo”).
One of the oldest and most straightforward methods for estimating the matter density of the
Universe is the mass-to-light ratio technique. The average ratio of the observed mass to light of the
largest possible system is used; assuming that the sample is fair, it can be multiplied by the total
luminosity density of the Universe to obtain the total mass density $\rho_m$. Zwicky was the first to do
this with the Coma cluster, but many followed.
Evidence for dark matter: mass-to-light (M/L) ratios. Astronomical observations of individual galaxies provide us with the (line-of-sight) radial luminosity distribution $I(R)$ and the
velocities of stars orbiting the center of the galaxy $v(R)$. From the luminosity distribution, the
deprojected density of the luminous matter $\rho_l(r)$ is computed by the Abel integral:
$$\rho_l(r) = -\frac{1}{\pi}\int_r^\infty \frac{dI}{dR}\frac{dR}{\sqrt{R^2 - r^2}}, \qquad (229)$$
where $R$ denotes the projected radius (as seen in the plane of the sky), and $r$ the spatial (deprojected) radius. From this spherical approximation to the density distribution of the galaxy, the
predicted rotation curves due to this luminous matter alone can be computed as follows:
$$\frac{m_\star v_l^2}{r} = G\frac{m_\star M(r)}{r^2} \quad \Rightarrow \quad v_l = \sqrt{\frac{GM(r)}{r}}, \qquad (230)$$
where
$$M(r) = 4\pi\int_0^r \rho_l(r)r^2\,dr \qquad (231)$$
is the galaxy mass enclosed within the sphere of radius $r$ (recall Newton's result that the force of an
isotropic massive sphere at radius $r$ is equivalent to the force due to a point mass with mass
$M(r)$). Eq. (230) simply equates the gravitational pull of the mass within the sphere
traced out by the orbiting star with the centripetal force needed to keep the star on its circular orbit. This $v_l(r)$ is represented by the sum of
the contributions of gas and stars in Fig. 13, which correspond to the long- and short-dashed
lines.
Kinematic observations of individual stars at different radii give us what the true rotation curves
are, i.e., what the actual velocity of stars v(r) as the function of radius is. This is shown by points
in Fig. 13.
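The rotation curve predicted from luminous matter alone, eqs. (230) and (231), can be sketched numerically. The density profile below is a hypothetical exponential sphere chosen purely for illustration (RHO0 and R_S are made-up values, not a fit to M33); the point is only that once most of the luminous mass is enclosed, the predicted speed falls off as $r^{-1/2}$ instead of staying flat.

```python
import math

G = 4.30091e-6   # gravitational constant in kpc (km/s)^2 / M_sun

# Hypothetical luminous density profile (an assumption, not a fit to any galaxy):
RHO0 = 1.0e8     # central density [M_sun / kpc^3]
R_S = 1.5        # scale radius [kpc]

def rho_l(r):
    """Assumed deprojected luminous density at radius r [kpc]."""
    return RHO0 * math.exp(-r / R_S)

def enclosed_mass(r, n=2000):
    """M(r) = 4 pi * integral of rho_l(r') r'^2 dr' from 0 to r (eq. 231), trapezoidal rule."""
    h = r / n
    total = 0.5 * rho_l(r) * r * r       # the integrand vanishes at r' = 0
    for i in range(1, n):
        x = i * h
        total += rho_l(x) * x * x
    return 4.0 * math.pi * total * h

def v_l(r):
    """Circular speed due to luminous matter alone (eq. 230) [km/s]."""
    return math.sqrt(G * enclosed_mass(r) / r)

for r in (1.0, 3.0, 10.0, 30.0, 60.0):
    print(f"r = {r:5.1f} kpc: v_l = {v_l(r):6.1f} km/s")
```

Doubling the radius from 30 to 60 kpc in this toy model lowers the predicted speed by a factor close to $\sqrt{2}$, the Keplerian behavior that flat observed rotation curves contradict.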
Through the measurements of mass-to-light ratios (which in the absence of dark matter is unity),
it has been demonstrated that galaxies, clusters of galaxies and super-clusters have a significant
non-luminous massive component – the dark matter.
Figure 13: Spiral galaxy M33 (2.5 million light-years away; member of the Local Group of galaxies):
image (left) and the observed rotation curves (points) approximated by the best-fitting model (solid lines).
The luminous contribution comes from the stellar disc (short-dashed lines) and from the gas (long-dashed
lines). The contribution from the dark-matter halo (dot-dashed line) dominates, especially at large radii.
Figure 14 shows the inferred mass-to-light ratios of many systems, ranging from galaxies to
super-clusters. The ratio was first measured on small scales, implying that the density in the
Universe is far below critical. As more large-scale measurements came in, the initially linear
increase in mass-to-light ratio led some to think that the trend would continue until
the critical density is reached, i.e., $\Omega_m = \Omega_T = 1$. However, it has been shown (see Fig. 14) that
mass-to-light ratios do not increase beyond $R \approx 1$ Mpc. The mass-to-light ratio levels off
at a value consistent with matter density $\Omega_{m0} \approx 0.3$. Because the total amount of matter is constant,
the matter energy density scales as $\rho_m \propto a^{-3}$, so
$$\Omega_m = \frac{\rho_m}{\rho_{cr0}} = \frac{\rho_{m0}}{\rho_{cr0}}a^{-3} = \Omega_{m0}a^{-3}. \qquad (232)$$
More evidence for dark matter. There are other methods which independently prove and
quantify the dark matter in the Universe. They include:
• Gravitational lensing. Direct consequence of GR: trajectory of a photon is affected by the
curvature of spacetime induced by the presence of a massive object (lens).
– Weak: small distortions in the shapes of background galaxies can be created via weak
lensing by foreground galaxy clusters. Statistical averaging of these small distortions
yields mass estimates of the cluster.
– Strong: light rays leaving a source in different directions are focused on the same spot
(the observer here on Earth) by the intervening galaxy or cluster of galaxies. It produces
multiple distorted images of the source from which the mass and shape of the lens can
be inferred. See Fig. 15.
The first application of gravitational lensing provided the first and the most notable confirmation of GR: solar eclipse in 1919 confirmed that the Sun bends light which passes near
it.
• The baryons-to-matter (baryons and dark matter) ratio in clusters of galaxies, which are the
largest known virialized objects, is likely representative of the Universe as a whole. If a
good estimate of the baryonic matter $\Omega_b$ is adopted from the previously described methods,
measuring the baryons-to-matter ratio $f_b \equiv \Omega_b/\Omega_m$ in these clusters will yield an estimate
of the fractional density of matter $\Omega_m$. The visible (baryonic) matter in clusters of galaxies
is largely in hot ionized intracluster gas, with only a small fraction in stars (about
an order of magnitude smaller). This means that the ratio $f_b$ is well-approximated by the
gas-to-matter ratio $f_g$, which can be measured via:
– X-ray spectrum: measure the mean gas temperature from the overall shape of the X-ray
spectrum, and the absolute value of the gas density from the X-ray luminosity.
– Sunyaev-Zeldovich effect: as the CMB radiation passes through the super-cluster whose
baryonic mass is dominated by gaseous ionized intracluster medium (ICM), a fraction
of photons inverse-Compton scatter off the hot electrons of the ICM. The intensity of
the CMB radiation is therefore diminished as compared to the unscattered CMB. The magnitude of this
decrease is proportional to the number of scatterers, weighted by their
temperature.
• Anisotropies in the CMB radiation.
Figure 14: Mass-to-light ratio as a function of scale (Bahcall, Lubin & Dorman 1995, Astrophysical Journal,
447, L81). The ratio flattens out to $\Omega_m \approx 0.3$ on largest scales.
Figure 15: Composite image of the Bullet cluster shows distribution of ordinary matter, inferred from X-ray
emissions, in red and total mass, inferred from gravitational lensing, in blue.
These independent methods, along with others not mentioned here, provide a compelling body of
evidence that the baryon density is of order 5% of the critical density, while the total matter
density is about five times larger. This clearly implies that most of the matter in the Universe cannot
be baryonic. It must be in some other form: dark matter.
From the standpoint of cosmology, the curvature of the Universe and the cosmic inventory, dark
matter is treated on equal footing with baryonic matter — it scales with the expanding Universe
as ρdm ∝ a−3 and contributes to the total energy density budget of the Universe.
Lecture 11: Cosmic Inventory II — continued:
Dark Matter Candidates
“The strongest arguments prove nothing so long as the conclusions are not verified by experience.
Experimental science is the queen of sciences and the goal of all speculation.”
Roger Bacon
The Big Picture: Last time we introduced the dark matter, along with its historical background,
evidence for its existence and its importance throughout the history of the Universe. Today we
present some of the leading candidates in the search for its yet unknown origin.
Baryonic Dark Matter: MACHOs
The initial mass function. Our ability to observe stars has limitations — it cuts off at some
lower level luminosity. The mass-distribution of stars as set during the process of star formation
— initial mass function (IMF) — is roughly approximated by
$$dn \propto m^{-\alpha}\,d\ln m, \qquad (233)$$
with $\alpha \approx 1.35$ (Salpeter 1955, Astrophysical Journal, 121, 161). This and similar models are
motivated empirically. We obtain the total density due to stars down to some lowest observable
stellar mass $m_c$ by integrating:
$$\rho_s = \int_{m_c}^\infty m\,dn. \qquad (234)$$
For the mass-distribution in eq. (233), the total mass density due to stars is
$$\rho_s \propto \int_{m_c}^\infty m^{1-\alpha}\,d\ln m = \int_{m_c}^\infty m^{1-\alpha}\frac{dm}{m} = \int_{m_c}^\infty m^{-\alpha}\,dm = \left[\frac{m^{1-\alpha}}{1-\alpha}\right]_{m_c}^\infty = \frac{m_c^{1-\alpha}}{\alpha - 1}, \qquad (235)$$
which means that lowering the threshold of detectable stellar mass by a factor of
2 increases the stellar mass density by a factor of $0.5^{1-1.35} = 2^{0.35} \approx 1.27$ (the prefactor $1/(\alpha - 1)$ cancels in the ratio). More recent studies
have shown that the IMF flattens out (the slope approaches $\alpha = 0$) below one solar mass
($m_c < M_\odot$). The uncertainties in the sub-stellar region, i.e., for masses lower than the mass
necessary to maintain the hydrogen-burning nuclear fusion reactions in the cores characteristic of stars,
are quite large, leading to our inability to accurately estimate the associated baryonic mass.
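The sensitivity of the stellar mass density to the lower mass cutoff follows from eq. (235) by taking a ratio of two cutoffs. A short sketch, assuming the Salpeter slope holds all the way down to the cutoff (which, as noted above, is not true below about one solar mass); the function name stellar_density is introduced here.

```python
ALPHA = 1.35   # Salpeter slope from the text

def stellar_density(m_c, alpha=ALPHA):
    """rho_s(m_c) up to a constant factor, eq. (235): m_c^(1 - alpha) / (alpha - 1)."""
    if alpha <= 1.0:
        raise ValueError("the integral converges at large masses only for alpha > 1")
    return m_c ** (1.0 - alpha) / (alpha - 1.0)

# Effect of halving the lowest detectable stellar mass (masses in arbitrary units).
gain = stellar_density(0.5) / stellar_density(1.0)
print(f"rho_s(m_c / 2) / rho_s(m_c) = {gain:.3f}")
```

Because the result depends only on the ratio of cutoffs, each further halving of the threshold multiplies the density by the same factor for as long as the power law remains valid.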
Brown dwarfs. Stars are born from self-gravitating clouds of gas. Gravitational collapse of gas
will cause the temperature to rise until nuclear burning can begin (a star is born!). The only way
that self-gravitating gas does not yield a star is if electron degeneracy sets in first and stops the
collapse. Electron degeneracy is a consequence of the Pauli exclusion principle: no two fermions
(in this case electrons) confined within a given region (in this case a star) can have the same
momentum and spin. Most of the electrons in dense matter must be in state of continual motion
which results in a pressure that increases as the matter density increases. The condition for the
onset of degeneracy is that the interparticle spacing becomes small enough for the uncertainty
principle to become important:
$$p \leq \hbar n^{1/3}, \qquad (236)$$
where $n$ is the electron number density and $p$ is the momentum. We can crudely estimate the
condition for this to occur by assuming that the body is of uniform density and temperature. For
a given mass, this yields an estimate of the maximum temperature that can be attained before
degeneracy becomes important:
$$T_{max} \approx 6\times10^8\left(\frac{M_{min}}{M_\odot}\right)^{4/3}\,\mathrm{K}, \qquad (237)$$
where $M_\odot$ is the Solar mass. Hydrogen fusion requires $T \approx 10^7$ K, so the resulting minimum stellar
mass is about
$$M_{min} \approx 0.05M_\odot. \qquad (238)$$
More accurate calculations lead to the more refined prediction $M_{min} \approx 0.08 \pm 0.01M_\odot$. Objects
much less massive than this will generate energy only gravitationally, and will therefore be virtually
invisible. Such objects are called brown dwarfs. These objects are very difficult to detect: their
spectra are heavily affected by broad molecular absorption bands, which are very hard to model.
The best limit on the possible contribution of low-mass objects comes from gravitational microlensing results from our own Galaxy. It is estimated that the objects below the nuclear burning
limit $M_{min}$ contribute about 20% of the dark matter in the Milky Way. It is not clear what the
contribution from brown dwarfs is elsewhere.
Figure 16: Baryonic dark matter candidates: brown dwarfs, white dwarfs, neutron stars and black holes.
(Dan Hooper, Dark Cosmos: In Search of Our Universe's Missing Mass and Energy, Collins, 2006).
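The crude minimum stellar mass quoted above can be recovered by inverting the maximum-temperature estimate. The sketch below assumes that the maximum temperature scales with mass as $(M/M_\odot)^{4/3}$; this exponent is an assumption made here, chosen because it reproduces the quoted $M_{min} \approx 0.05M_\odot$ from the stated coefficient and fusion temperature.

```python
T_FUSION_K = 1.0e7      # temperature needed for hydrogen fusion [K], from the text
T_COEFF_K = 6.0e8       # coefficient of the maximum-temperature estimate [K], from the text
EXPONENT = 4.0 / 3.0    # assumed mass scaling of T_max (see the note above)

# Solve T_COEFF_K * (M / M_sun)^EXPONENT = T_FUSION_K for M / M_sun.
m_min = (T_FUSION_K / T_COEFF_K) ** (1.0 / EXPONENT)
print(f"M_min = {m_min:.3f} M_sun")
```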
White dwarfs. White dwarfs form from the collapse of stellar cores once nuclear burning has
ceased there. They arise when the core remnant after the death of the star is smaller than the
Chandrasekhar mass of about $1.4M_\odot$. The end of nuclear burning in these smaller stars is followed
by a “helium flash” which blows off the outer parts of the star, thus creating a planetary nebula.
The remaining core contracts under its own gravity until, having reached a size similar to that of
the Earth, it becomes so dense ($5\times10^8$ kg/m$^3$) that it is supported against further collapse by the
pressure of electron degeneracy. White dwarfs gradually cool, becoming fainter and redder. They
may constitute about 30% of the stars in the solar neighborhood, but because of their low luminosity
(typically $10^{-3}$ to $10^{-4}$ of the Sun's) they are very inconspicuous.
Neutron stars. Neutron stars form from the collapse of stellar cores once nuclear burning has
ceased there. They arise when the core remnant after the death of the star is larger than the
Chandrasekhar mass of about $1.4M_\odot$, but still smaller than about $2M_\odot$. The neutron stars which
are created from core remnants with $M > 2M_\odot$ will eventually collapse further into a black hole.
The end of nuclear burning in larger stars is followed by a “supernova”. The remaining core
contracts under its own gravity until, at a density of about $10^{17}$ kg/m$^3$, electrons and protons
are so closely packed that they combine to form neutrons. The resultant object, consisting only of
neutrons, is supported against further gravitational collapse by the pressure of neutron degeneracy.
A typical neutron star, with a mass a little greater than the mass of the Sun, has a diameter of only
about 30 km. (Pulsars are spinning magnetized neutron stars.)
Black holes. Black holes form from the collapse of stellar cores once nuclear burning has ceased
there. They arise when the core remnant after the death of the star is larger than $2M_\odot$. When
neutron degeneracy pressure becomes insufficient to support the neutron star against collapse, its
radius shrinks to below a critical size known as the Schwarzschild radius.
All of these compact (sub-)stellar objects are examples of MAssive Compact Halo Objects (MACHOs), objects which we cannot directly see. Therefore, they are a form of dark matter. However,
even with their contributions added to the visible baryonic matter, the total still falls
significantly short of the matter content of the Universe.
Non-Baryonic Dark Matter: WIMPs
Although we presented a strong case that at least a portion of the dark matter content is
baryonic, there exists strong cosmological evidence that the dark matter consists of weakly interacting relic particles. The strongest case is built by primordial nucleosynthesis, or Big Bang
Nucleosynthesis (BBN), which estimates that the baryonic contribution to the total energy density
is $\Omega_{b0} \approx 0.0125h^{-2}$. This is the contribution of the protons and neutrons that interacted to fix the
light-element abundances at t ≈ 1 minute or so.
At the time of BBN, the Universe consisted of baryons (plus electrons, which are implicitly
included in the “baryonic matter”), photons and three species of neutrinos. To account for dark
matter, one can proceed in two ways from there: (i) neutrinos have mass; or (ii) there must exist
some additional particle species that is a frozen-out relic from an earlier epoch.
A small neutrino mass would not affect the BBN, since the neutrinos are ultrarelativistic prior
to matter–radiation equality. Other relic particles would have to be either very rare or extremely
weakly coupled (even more weakly than neutrinos) in order not to affect the BBN. Either alternative
would produce dark matter that is collisionless, which is the main argument in favor of
nonbaryonic dark matter: the clustering power spectrum appears to be free of the oscillatory features
expected from the gravitational growth of perturbations in matter that is able to support sound
waves.
There are a number of different Weakly Interacting Massive Particle (WIMP) candidates for
the dark matter particle. They are called “weakly interacting” because they interact only through the weak
interaction and gravity, and are therefore notoriously difficult to detect.
• Massive neutrinos. The most obvious species of nonbaryonic dark matter to consider as
a dark matter particle candidate is a massive neutrino. Because neutrinos are very weakly
interacting, it is still unclear what the mass of neutrinos may be. Recent experiments only
put constraints on the difference of squares of masses of two flavors of neutrinos.
• Supersymmetric particles. Particles which are part of the theory of supersymmetry
(SUSY), and which are yet to be detected are also considered as dark matter candidates.
Among them are particles like axions and neutralinos.
Figure 17: Constraint on the baryon density from the BBN. Predictions are shown for the four light
elements: $^4$He, deuterium (D), $^3$He and lithium (Li). The boxes represent observations. There is only an
upper limit on the primordial abundance of $^3$He. (Burles, Nollett & Turner 1999, astro-ph/9903300).
There is another categorization of WIMPs, which is more descriptive of their nature: hot and cold
dark matter.
Hot dark matter (HDM). Hot dark matter particles — neutrinos — decouple when they are
relativistic, and have a number density roughly equal to that of photons. These low-mass relics
are hot in the sense of possessing large quasi-thermal velocities. These velocities were larger at
high redshifts, which resulted in major effects on the development of self-gravitating structures.
The structure forms by fragmentation — top-down — with largest super-clusters forming first in
flat sheets and subsequently fragmenting into smaller pieces to form smaller structures — clusters,
galaxies and stars.
The predictions of HDM strongly disagree with observations.
Cold dark matter (CDM). Most cosmologists favor the CDM theory as a description of how the
Universe went from a smooth initial state at early times (as demonstrated by the CMB radiation)
to the lumpy distribution of galaxies and clusters of galaxies that we observe today.
In the CDM theory, the structure grows hierarchically — bottom-up — with small objects
collapsing first and merging in a continuous hierarchy to form more and more massive objects —
stars, galaxies, clusters, super-clusters. The CDM clusters hierarchically with the number count
growing with the decreasing size of halos.
The predictions of CDM generally agree with observations. There are, however, two important discrepancies between predictions of the CDM paradigm and observations of galaxies and their clustering,
which create a potential crisis for the CDM picture:
• The cuspy halo problem: CDM predicts central density profiles of galaxies that are much
steeper than those observed.
• The missing satellites problem: CDM predicts a large number of small dwarf galaxies of about
one thousandth the mass of the Milky Way. The observed number of these dwarf galaxies and their
small halos is orders of magnitude lower than expected from simulations.
Lecture 12: Cosmic Inventory III: Dark Energy
“It is far better to grasp the Universe as it really is than to persist in delusion, however satisfying
and reassuring.”
Carl Sagan
The Big Picture: For the last couple of lectures we talked about the dark matter — its historical
background, evidence for its existence, its importance for the history of the Universe, as well as
some of the leading candidates in the search for its yet unknown origin. Today we are going to
discuss the dark energy — evidence for its existence, its implication for the structure and evolution
of the Universe and some alternatives.
Dark Energy
The notion of a “cosmological constant” has been floating around since the time of Newton
(see Article 2). However, it is only recently that it has obtained firm footing with theoretical and
observational evidence. There are two sets of evidence which support the existence of additional
energy density — “dark energy” — due to cosmological constant:
1. Budgetary shortfall. The total energy density of the Universe is very close to critical, as
suggested both: (i) theoretically from the inflation in the early Universe; and (ii) observationally from the anisotropies of the CMB radiation. However, the observations can only account
for about a third of the total critical energy density. The remaining, unaccounted, two thirds
of the density in the Universe must be in some smooth, unclustered form — dark energy.
2. Theoretical distance-redshift relations. Given the energy composition of the Universe,
one can put together graphs of theoretical distance (luminosity for instance) versus redshift,
which can be verified observationally. In 1998, two groups (Riess et al. 1998, Astronomical Journal, 116, 1009; Perlmutter et al. 1999, Astrophysical Journal 517, 565) observing
supernovae reported direct evidence for the dark energy.
The evidence is based on the difference between the dependence of the luminosity distance
dL on redshift z in matter-dominated Universe and in the dark energy-dominated Universe.
These dependences are given in Fig. 10. The graph shows that the luminosity distance
is larger for objects at higher redshifts in a dark energy-dominated Universe. This means
that the objects of fixed intrinsic brightness (“standard candles”) will appear dimmer in the
Universe composed of predominantly dark energy.
Using Luminosity Distance Vs. Redshift Graphs to Detect Dark Energy
Let us illustrate how this direct evidence of the dark energy was obtained from the measurements
of the luminosity distance for Type Ia supernovae, which are considered “standard candles” because their
intrinsic (absolute) luminosities are nearly identical.
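The luminosity distance $d_L$ is given by eq. (189):
$$ d_L = \frac{\chi}{a}, \tag{239} $$
where $\chi$ is the comoving distance defined in eq. (179) as
$$ \chi \equiv \int_{t(a)}^{t_0} \frac{d\tilde{t}}{a(\tilde{t})} = \int_a^1 \frac{d\tilde{a}}{\tilde{a}\,\dot{\tilde{a}}} = \int_a^1 \frac{d\tilde{a}}{\tilde{a}^2\,(\dot{\tilde{a}}/\tilde{a})} = \int_a^1 \frac{d\tilde{a}}{\tilde{a}^2 H(\tilde{a})}, \tag{240} $$
after the change of variables $da/dt = aH$ and recalling $H \equiv \dot{a}/a$. Allowing for the non-zero
cosmological constant $\Lambda$ representing dark energy in addition to matter in a flat Universe ($\Omega_T = 1 = \Omega_m + \Omega_{de}$), we have from the first Friedmann equation (eq. (154)):
$$ \left(\frac{\dot{a}}{a}\right)^2 = H_0^2\left[(1 - \Omega_{de0})\frac{1}{a^3} + \Omega_{de0}\right] \quad\Longrightarrow\quad \dot{a} = H_0\sqrt{(1 - \Omega_{de0})a^{-1} + \Omega_{de0}a^2}. \tag{241} $$
After substituting into eq. (240) above, we obtain
$$ \chi(a) = \frac{1}{H_0}\int_a^1 \frac{d\tilde{a}}{\tilde{a}\sqrt{(1 - \Omega_{de0})\tilde{a}^{-1} + \Omega_{de0}\tilde{a}^2}} = \frac{1}{H_0}\int_a^1 \frac{d\tilde{a}}{\sqrt{(1 - \Omega_{de0})\tilde{a} + \Omega_{de0}\tilde{a}^4}}, \tag{242} $$
or, in terms of the redshift $z$, from the relation $a = 1/(1 + z)$:
$$ \chi(z) = \frac{1}{H_0}\int_0^z \frac{d\tilde{z}}{\sqrt{(1 - \Omega_{de0})(1 + \tilde{z})^3 + \Omega_{de0}}}. \tag{243} $$
The corresponding luminosity distance $d_L$ is then given by
$$ d_L(z) \equiv \chi(z)(1 + z) = \frac{1 + z}{H_0}\int_0^z \frac{d\tilde{z}}{\sqrt{(1 - \Omega_{de0})(1 + \tilde{z})^3 + \Omega_{de0}}}, \tag{244} $$
which is what is used to obtain Fig. 10.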
The apparent magnitude $m$ and the absolute magnitude $M$ are related to fluxes by
$$ m = -\frac{5}{2}\log(F) + \mathrm{const.}, \tag{245} $$
or, after recalling that the flux scales as $d_L^{-2}$ (eq. (186)),
$$ m = M + 5\log\left(\frac{d_L}{10\,\mathrm{pc}}\right) + \mathrm{const.} \tag{246} $$
The conventional way to write the relationship between apparent and absolute magnitudes is
$$ m - M = 5\log(d_L) + K, \tag{247} $$
where $K$ is a correction for the shifting of the spectrum into or out of the wavelength range measured
due to expansion. When comparing apparent magnitudes $m_1$ and $m_2$ of two objects of the same
type, with the same absolute magnitude $M$ (such as Type Ia supernovae), the above equation
is equivalent to
$$ m_1 - m_2 = 5\log\left(\frac{d_L(m_1)}{10\,\mathrm{pc}}\right) - 5\log\left(\frac{d_L(m_2)}{10\,\mathrm{pc}}\right). \tag{248} $$
This is because of the way magnitudes are defined: a difference of 5 magnitudes
(mag) is equivalent to a brightness (flux) ratio of 100:
$$ \frac{F_2}{F_1} = 100^{(m_1 - m_2)/5}, \tag{249} $$
where $F_1$ and $F_2$ are fluxes of the two objects and $m_1$ and $m_2$ are their apparent magnitudes.
The methodology of this kind of measurement can be illustrated by considering two supernovae from this sample: SN 1997ap at redshift $z_1 = 0.83$ with apparent magnitude $m_1 = 24.32$ and
SN 1992P at redshift $z_2 = 0.026$ with apparent magnitude $m_2 = 16.08$. Since the absolute magnitudes of these are the same (because Type Ia supernovae are “standard candles”), the difference in
apparent magnitudes is entirely due to the difference in luminosity distance:
$$ m_1 - m_2 = 5\log\left(\frac{d_L(z_1)}{10\,\mathrm{pc}}\right) - 5\log\left(\frac{d_L(z_2)}{10\,\mathrm{pc}}\right). \tag{250} $$
The second supernova (SN 1992P) is so close that its luminosity distance is unaffected by cosmology
(see Fig. 18), and follows the Hubble law valid for small redshifts $z$: $d_L = z/H_0$. The
luminosity distance for SN 1992P is then given by $d_L(z_2) = z_2/H_0 = 0.026/H_0$. The only remaining
unknown in eq. (250) is fixed by observations to be
$$ d_L(z = 0.83) = 1.16/H_0. \tag{251} $$
For a flat, matter-dominated Universe ($\Omega_T = \Omega_m = 1$), the luminosity distance at $z = 0.83$ is equal
to $0.95/H_0$, while a Universe with $\Omega_{m0} = 0.3$ and $\Omega_{de0} = 0.7$ has a luminosity distance of
$1.23/H_0$. The observed value lies much closer to the latter, so the apparent magnitude of the supernova SN 1997ap suggests that there is a
sizable component of dark energy.
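The two-supernova estimate can be checked numerically. The following is a minimal sketch, not part of the original notes: it integrates eq. (244) with a composite Simpson rule and inverts eq. (250) for the observed distance. The function names (`simpson`, `luminosity_distance`) are illustrative choices, and distances are expressed in units of $1/H_0$.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def luminosity_distance(z, omega_de0):
    """d_L(z) in units of 1/H0 for a flat Universe with matter and dark energy, eq. (244)."""
    def integrand(x):
        return 1.0 / math.sqrt((1.0 - omega_de0) * (1.0 + x) ** 3 + omega_de0)
    return (1.0 + z) * simpson(integrand, 0.0, z)

# Values quoted in the text for the two supernovae
z1, m1 = 0.83, 24.32     # SN 1997ap
z2, m2 = 0.026, 16.08    # SN 1992P

d2 = z2                                    # Hubble law d_L = z/H0 for the nearby supernova
d1_observed = d2 * 10 ** ((m1 - m2) / 5)   # eq. (250) solved for d_L(z1)

d1_matter = luminosity_distance(z1, 0.0)   # flat, matter only
d1_lambda = luminosity_distance(z1, 0.7)   # flat, Omega_de0 = 0.7

print(f"observed:        {d1_observed:.2f}/H0")
print(f"matter only:     {d1_matter:.2f}/H0")
print(f"Omega_de0 = 0.7: {d1_lambda:.2f}/H0")
```

Under these assumptions the three numbers should land close to the values quoted above, with the observed distance sitting nearer the dark energy curve than the matter-only one.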
[Figure 18 plot: luminosity distance $d_L$ (in units of $1/H_0$) versus redshift $z$ in a flat Universe, with curves for $\Omega_{de0} = 0.7$ and $\Omega_{de0} = 0$ and data points for SN 1997ap and SN 1992P.]
Figure 18: Luminosity distance dL versus the redshift z graphs for the flat matter-dominated Universe (thin
lines) and flat Universe with matter and dark energy corresponding to Ωde0 = 0.7 (thick lines). The two
points are observed luminosity distances for the two Type Ia supernovae: SN 1992P at z = 0.026 and SN
1997ap at z = 0.83.
The two groups measured the apparent magnitudes m for a large set of Type Ia supernovae
and established a systematic bias toward the Universe with a considerable contribution to the total
energy density coming from dark energy (Fig. 19).
The measurement of Type Ia supernovae conducted by the two teams led to the constraints on
the Universe presented in Fig. 20. The two free parameters are the relative content of matter (ΩM )
and the dark energy modeled as a cosmological constant or vacuum energy (ΩΛ ), which is only one
of the possible ways to model it. Figure 20 seems to confidently rule out the flat matter-dominated
Universe (ΩΛ = 0, ΩM = 1), as well as the open Universe with only matter (ΩM = 0.3).
Figure 21 shows the age of the Universe and its acceleration for different ratios of ΩΛ and ΩM .
Figure 20 allows for a great deal of freedom — the shaded, most probable region is quite
elongated allowing for a broad range of viable ratios.
In order to allow for other forms of dark energy, we allow for dark energy density to be time-dependent (and not due to the cosmological constant Λ). The equation of state $P = P(\rho)$ for dark
energy must obey Friedmann’s second equation (eq. (101b)):
$$ \frac{d\rho}{dt} + 3\frac{\dot{a}}{a}(\rho + P) = 0. \tag{252} $$
For time-independent dark energy, i.e. due to cosmological constant Λ, the equation of state is
P = −ρ.
(253)
Earlier we introduced a parameter $w$ in the equation of state:
$$ w \equiv \frac{P}{\rho}, \tag{254} $$
where w = 0 for matter, w = 1/3 for radiation and w = −1 for dark energy due to cosmological
constant (see Table 3). The two studies of supernovae also computed the likelihood regions in the
(ΩM , w) space in the case of flat Universe. Figure 22 shows that the cosmological constant (w = −1)
is allowed, but not the only possibility.
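To compute how the time-dependent dark energy density, as denoted by $w = w(t)$, or equivalently $w = w(a)$, evolves with the expanding Universe, we can solve eq. (252) with $w = w(a)$:
$$ \frac{d\rho}{dt} = -3\frac{\dot{a}}{a}\left[\rho + w(a)\rho\right] = -3\rho\left[1 + w(a)\right]\frac{da}{a\,dt} $$
$$ \Longrightarrow\quad \int \frac{d\rho}{\rho} = -3\int^a \left[1 + w(a')\right]\frac{da'}{a'} $$
$$ \Longrightarrow\quad \rho \propto \exp\left\{-3\int^a \left[1 + w(a')\right]\frac{da'}{a'}\right\}. \tag{255} $$
If $w = \mathrm{const.}$, then
$$ \rho \propto \exp\left\{-3(1 + w)\int^a \frac{da'}{a'}\right\} = \exp\left\{-3(1 + w)\ln a\right\} = \exp\left\{\ln a^{-3(1+w)}\right\} = a^{-3(1+w)}, \tag{256} $$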
which matches ρ ∝ a−3 for w = 0 (matter: dust approximation P = 0), ρ ∝ a−4 for w = 1/3
(radiation: perfect fluid approximation P = ρ/3) and ρ ∝ const. for w = −1 (cosmological constant
Λ: P = −ρ).
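As a numerical cross-check of eqs. (255) and (256), the sketch below evaluates the integral in eq. (255) for an arbitrary $w(a)$, normalized so that the ratio is 1 at $a = 1$. The function name `density_ratio` is illustrative, and the linear form $w(a) = w_0 + w_a(1 - a)$ used in the last example is an assumption introduced here purely for illustration; it does not appear in these notes.

```python
import math

def density_ratio(a, w, n=2000):
    """rho(a)/rho(a=1) from eq. (255), for a callable w(a).

    The substitution u = ln a' turns the integral of [1 + w(a')] da'/a'
    from 1 to a into the integral of [1 + w(e^u)] du from 0 to ln a,
    which is evaluated with composite Simpson's rule (n must be even).
    """
    u_max = math.log(a)
    h = u_max / n                      # negative for a < 1, which Simpson handles
    def g(u):
        return 1.0 + w(math.exp(u))
    total = g(0.0) + g(u_max)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(i * h)
    integral = total * h / 3.0
    return math.exp(-3.0 * integral)

a = 0.5
matter = density_ratio(a, lambda x: 0.0)          # expect a^-3
radiation = density_ratio(a, lambda x: 1.0 / 3)   # expect a^-4
lambda_cc = density_ratio(a, lambda x: -1.0)      # expect constant density

# Hypothetical time-dependent example: w(a) = w0 + wa * (1 - a)
w0, wa = -0.9, 0.3
varying = density_ratio(a, lambda x: w0 + wa * (1.0 - x))
# Closed form of eq. (255) for this particular w(a)
varying_exact = a ** (-3.0 * (1.0 + w0 + wa)) * math.exp(-3.0 * wa * (1.0 - a))

print(matter, radiation, lambda_cc, varying, varying_exact)
```

For constant $w$ the numerical result should reproduce the power law of eq. (256); the varying case shows that the same integral covers forms of dark energy other than the cosmological constant.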
There are several “popular” values of w for the dark energy:
• −1 < w < −1/3: quintessence,
• w = −1: cosmological constant Λ,
• w < −1: phantom energy.
Alternative to Dark Energy
One approach toward explaining what we perceive as dark energy is to revisit the underlying
assumptions of our cosmological model and the resulting equations, most notably the assumption
of “homogeneity” of the Universe. The Universe only appears homogeneous on the largest scales,
while it has a complicated “Swiss cheese” structure whose expansion differs from the expansion of
the homogeneous model. After revisiting Einstein’s equations, one finds that the inhomogeneity
generates a term analogous to the vacuum energy term. It is still very much an open issue whether
this term is of sufficient magnitude to cause the Universe to evolve in the manner we observe.
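[Figure 19 plot, labeled MLCS: $m - M$ (mag) in the upper panel and $\Delta(m - M)$ (mag) in the lower panel versus redshift $z$, with model curves for ($\Omega_M = 0.24$, $\Omega_\Lambda = 0.76$), ($\Omega_M = 0.20$, $\Omega_\Lambda = 0.00$) and ($\Omega_M = 1.00$, $\Omega_\Lambda = 0.00$).]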
Figure 19: Luminosity distance dL , given in terms of the difference between the apparent m and absolute
M magnitudes, versus the redshift z for a set of Type Ia supernovae from Riess et al. 1998, Astronomical
Journal, 116, 1009.
Figure 20: Constraints from Type Ia supernovae on the parameters (Ωm0 and Ωde0 ) from Perlmutter et
al 1999, Astrophysical Journal 517, 565. Flat, matter-dominated Universe — denoted by a circle at (1, 0) is
ruled out with high confidence. The straight line extending from upper left to lower right corresponds to a
flat Universe (ΩT = 1 = ΩM + ΩΛ ).
Figure 21: The age of the Universe for different breakdowns between the relative content of the dark energy
(ΩΛ) and matter (ΩM ) from Perlmutter et al. 1999, Astrophysical Journal 517, 565. For a flat, matter-dominated Universe, we found earlier (eq. (117)) that t0 ≈ 9.1 Gyr with h = 0.72, or, for h = 0.63 (as in
Perlmutter et al. 1999), t0 = 10.4 Gyr.
Figure 22: Constraints in a flat Universe from Type Ia supernovae on the matter density ΩM and the equation
of state of the dark energy w (Perlmutter et al. 1999, Astrophysical Journal 517, 565). Cosmological constant
corresponds to w = −1, and matter to w = 0.
Lecture 13: History of the Very Early Universe
“The Universe is full of magical things, patiently waiting for our wits to grow sharper.”
Eden Phillpotts
The Big Picture: Today we are going to outline the standard model of the Universe in the first
few minutes following the hot Big Bang. These earliest epochs in the evolution of the Universe
are still inadequately understood. As we move away from the Big Bang, our understanding of the
physical epochs of the Universe steadily improves.
Keeping Track of Universe’s History
The different times in the history of the Universe can be tracked by any of the several quantities
which change monotonically throughout: age of the Universe t, scale factor a, redshift (as we observe
it today) z and temperature of the CMB radiation T (currently measured at ≈ 2.7 K).
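From eq. (205)
$$ \rho_\gamma = \frac{\pi^2}{15} T_\gamma^4, \tag{257} $$
and the result derived from Friedmann’s second equation that the radiation scales as
$$ \rho_\gamma \propto a^{-4}, \tag{258} $$
we obtain that
$$ T_\gamma(a) \propto a^{-1}, \tag{259} $$
which, combined with the current measurement of the temperature of the CMB radiation
$$ T_{\gamma 0} = T_\gamma(a = 1) \approx 2.7\ \mathrm{K}, \tag{260} $$
yields
$$ T_\gamma(a) \approx 2.7\,a^{-1}\ \mathrm{K}. \tag{261} $$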
To relate this to the age of the Universe t, one can explicitly solve integrals for a(t) and substitute
in eq. (261).
The mutual relationship between the quantities t, z, a and Tγ is given in Table 5.
It is beneficial to relate directly, albeit crudely, the temperature $T$ and the age of the
Universe $t$. This can be done analytically only for a matter-dominated or a radiation-dominated Universe, as we have done in Lecture 5. (Relating the scale factor $a$ and the age of the Universe $t$ in a
more general case, when the Universe has matter, radiation and the cosmological constant (as vacuum
energy), requires solving the integral given in Table 5 for $t(a)$ and inverting it. This can only be
done numerically.) Therefore, as a rough approximation, let us recall:
1. flat, matter-dominated Universe [eq. (115)]: $a(t) = \left(\frac{3H_0}{2}\right)^{2/3} t^{2/3}$,
2. flat, radiation-dominated Universe [eq. (119)]: $a(t) = (2H_0)^{1/2}\, t^{1/2}$,
where
$$ H_0 = 100h\ \mathrm{km\,sec^{-1}\,Mpc^{-1}} = 100h \left(\frac{1000\ \mathrm{m}}{1\ \mathrm{km}}\right)\left(\frac{1\ \mathrm{Mpc}}{3.0856 \times 10^{22}\ \mathrm{m}}\right) \mathrm{km\,sec^{-1}\,Mpc^{-1}} \approx 3.24h \times 10^{-18}\ \mathrm{sec^{-1}} \approx 2.3 \times 10^{-18}\ \mathrm{sec^{-1}}, \tag{262} $$
with $h \approx 0.72$. Therefore, for the two approximations, we have:
1. flat, matter-dominated Universe: $a(t) = 2.3 \times 10^{-12}\, t^{2/3}$,
2. flat, radiation-dominated Universe [eq. (119)]: $a(t) = 2.2 \times 10^{-10}\, t^{1/2}$.
When these are combined with eq. (261), we obtain:
1. flat, matter-dominated Universe: $T_\gamma(t) \approx 10^{12}\, t^{-2/3}$ K,
2. flat, radiation-dominated Universe: $T_\gamma(t) \approx 10^{10}\, t^{-1/2}$ K.
To estimate the age of the Universe in Table 6, we use flat, matter-dominated Universe.
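The matter-dominated numbers used for Table 6 can be reproduced with a few lines of arithmetic. This is a minimal sketch covering only the matter-dominated case; the variable and function names are illustrative, and the inputs ($h = 0.72$, the Mpc-to-metre conversion, $T_{\gamma 0} \approx 2.7$ K) are the ones quoted in the text.

```python
MPC_IN_M = 3.0856e22        # metres per megaparsec, as in eq. (262)
h = 0.72                    # dimensionless Hubble parameter used in the text
T_CMB_TODAY = 2.7           # K, eq. (260)

# H0 = 100 h km/s/Mpc converted to 1/s
H0 = 100.0 * h * 1000.0 / MPC_IN_M

# Flat, matter-dominated Universe: a(t) = (3 H0 / 2)^(2/3) * t^(2/3)
a_coeff_matter = (1.5 * H0) ** (2.0 / 3.0)

# T(a) = 2.7 / a  =>  T(t) = T_coeff * t^(-2/3) kelvin, with t in seconds
T_coeff_matter = T_CMB_TODAY / a_coeff_matter

def temperature_matter(t_seconds):
    """Crude matter-dominated estimate of the CMB temperature (K) at age t (s)."""
    return T_coeff_matter * t_seconds ** (-2.0 / 3.0)

print(f"H0 = {H0:.2e} 1/s")
print(f"a(t) coefficient = {a_coeff_matter:.2e}")
print(f"T(t) coefficient = {T_coeff_matter:.2e} K")
print(f"T at t = 1e-6 s  = {temperature_matter(1e-6):.1e} K")
```

The temperature coefficient comes out slightly above $10^{12}$ K, which the text rounds to an order of magnitude.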
[Figure 23 plot: temperature $T$ (in K and in eV) versus age of the Universe $t$ (in s) on logarithmic axes, with curves labeled “matter” and “radiation” and the Planck, GUT, Inflationary, Electroweak, Quark, Hadron and Lepton epochs marked.]
Figure 23: The temperature (given in both K and eV) of the Universe ($T$) versus the age of the Universe ($t$)
based on matter-dominated (solid line) and radiation-dominated (dashed line) approximations. The epochs
in the earliest history of the Universe are outlined. [We approximated 1 eV ≈ $10^4$ K (more precisely, 1 eV = 11605 K).]
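Table 5: Relationship between the scale of the Universe ($a$), age of the Universe ($t$), redshift as observed
from here today ($z$) and the temperature of the CMB radiation $T_\gamma$.

| Quantity | Dependence on scale $a$ | Dependence on redshift $z$ |
|---|---|---|
| age $t$ | $t(a) = \frac{1}{H_0}\int_0^a \frac{\tilde{a}\,d\tilde{a}}{\sqrt{\Omega_{m0}\tilde{a} + \Omega_{r0} + \Omega_{de0}\tilde{a}^4}}$ | $t(z) = \frac{1}{H_0}\int_z^\infty \frac{d\tilde{z}}{\sqrt{\Omega_{m0}(1+\tilde{z})^5 + \Omega_{r0}(1+\tilde{z})^6 + \Omega_{de0}(1+\tilde{z})^2}}$ |
| redshift $z$ | $z(a) = \frac{1}{a} - 1$ | – |
| scale $a$ | – | $a(z) = \frac{1}{1+z}$ |
| temperature $T_\gamma$ | $T_\gamma(a) = 2.7\,a^{-1}$ | $T_\gamma(z) = 2.7\,(z + 1)$ |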
The Big Bang: t = 0 s
Extrapolation of the expansion of the Universe backwards in time using general relativity yields an
infinite density and temperature at a finite time in the past. This singularity signals the breakdown
of GR. How closely we can extrapolate towards the singularity is debated — certainly not earlier
than the Planck epoch. The early hot, dense phase is itself referred to as “the Big Bang”, and is
considered the “birth” of our Universe — The Beginning.
The discussion about the nature, cause and origin of the Big Bang itself is untestable and as
such quickly enters the waters of metaphysics and theology.
The Planck Epoch: $0 < t \le 10^{-43}$ s
The Planck epoch is the earliest period of time in the history of the Universe, spanning the
brief time immediately following the Big Bang during which the quantum effects of gravity were
significant.
In order to compute the time-scale over which quantum effects dominate (barring the existence
of branes which would circumvent them), we use dimensional analysis:
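| Effects | Constant | Value | Units |
|---|---|---|---|
| Relativity | $c$ | $3 \times 10^{10}$ | cm/s |
| Quantum mechanics | $h$ | $6.63 \times 10^{-27}$ | g cm²/s |
| Gravitation | $G$ | $6.67 \times 10^{-8}$ | cm³/(g s²) |

We need to find the way to combine the constants above to obtain the relevant time scale:
$$ c^A h^B G^D = \mathrm{s} \quad\Longrightarrow\quad \left(\frac{\mathrm{cm}}{\mathrm{s}}\right)^A \left(\frac{\mathrm{g\,cm^2}}{\mathrm{s}}\right)^B \left(\frac{\mathrm{cm^3}}{\mathrm{g\,s^2}}\right)^D = \mathrm{s}, $$
[cm]: $A + 2B + 3D = 0$
[g]: $B - D = 0$
[s]: $-A - B - 2D = 1$
Solution: $A = -\frac{5}{2}$, $B = \frac{1}{2}$, $D = \frac{1}{2}$ $\Longrightarrow$ $t_P = c^{-5/2} h^{1/2} G^{1/2}$.
The time scale for quantum gravity, the Planck time $t_P$, is therefore
$$ t_P \equiv \sqrt{\frac{hG}{c^5}}, \tag{263} $$
which numerically is equal to
$$ t_P = \left[\frac{(6.63 \times 10^{-27})(6.67 \times 10^{-8})}{(3 \times 10^{10})^5}\right]^{1/2} \approx 10^{-43}\ \mathrm{s}. \tag{264} $$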
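The dimensional analysis above can be verified directly. The sketch below, with illustrative variable names, checks that the exponents satisfy the three unit equations and then evaluates eq. (264) using the rounded cgs values from the table.

```python
import math

# Rounded cgs values from the table above
c = 3e10        # cm / s
h = 6.63e-27    # g cm^2 / s
G = 6.67e-8     # cm^3 / (g s^2)

# Exponents of c, h and G in t_P = c^A h^B G^D
A, B, D = -2.5, 0.5, 0.5

# Powers of each unit in c^A h^B G^D; they must combine to exactly one second
cm_power = A + 2 * B + 3 * D     # should be 0
g_power = B - D                  # should be 0
s_power = -A - B - 2 * D         # should be 1

t_planck = c ** A * h ** B * G ** D
t_planck_closed_form = math.sqrt(h * G / c ** 5)   # eq. (263)

print(cm_power, g_power, s_power)
print(f"t_P = {t_planck:.2e} s")
```

With these rounded constants the result is of order $10^{-43}$ s, as stated in eq. (264).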
If supersymmetry is correct, then during this time the four fundamental forces (electromagnetism, weak force, strong force and gravity) all have the same strength, so they are possibly
unified into one fundamental force. Our understanding of this early epoch is still quite tenuous,
awaiting a happy marriage of quantum mechanics and relativistic gravity.
Grand Unification Epoch: $10^{-43}$ s ≤ t ≤ $10^{-36}$ s
Assuming the existence of a Grand Unification Theory (GUT), the Grand Unification Epoch
was the period in the evolution of the early Universe following the Planck epoch, in which the
temperature of the Universe was comparable to the characteristic temperatures of GUTs. If the
grand unification energy is taken to be $10^{15}$ GeV, this corresponds to temperatures higher than
$10^{27}$ K. During this period, three of the four fundamental interactions (electromagnetism, the
strong interaction, and the weak interaction) were unified as the electronuclear force. Gravity had
separated from the electronuclear force at the end of the Planck era. During the Grand Unification
Epoch, physical characteristics such as mass, charge, flavor and color charge were meaningless.
The Grand Unification Epoch ended at approximately $10^{-36}$ s after the Big Bang. At this point,
the strong force separated from the other fundamental forces.
Inflationary Epoch: $10^{-36}$ s ≤ t ≤ $10^{-32}$ s
The Inflationary Epoch was the period in the evolution of the early Universe when, according
to inflation theory, the Universe underwent an extremely rapid exponential expansion. This rapid
expansion increased the linear dimensions of the early Universe by a factor of at least $10^{26}$ (and
possibly a much larger factor), and so increased its volume by a factor of at least $10^{78}$. At this
time, the strong force started to separate from the electroweak interaction.
The expansion is thought to have been triggered by the phase transition that marked the end
of the preceding Grand Unification Epoch at approximately $10^{-36}$ s after the Big Bang. One of the
theoretical products of this phase transition was a scalar field called the inflaton field. As this
field settled into its lowest energy state throughout the Universe, it generated a repulsive force that
led to a rapid expansion of the fabric of spacetime. This expansion explains various properties of
the current Universe that are difficult to account for without the Inflationary Epoch (flat Universe,
horizon problem, magnetic monopoles).
The rapid expansion of spacetime meant that elementary particles remaining from the Grand
Unification Epoch were now distributed very thinly across the Universe. However, the huge potential energy of the inflaton field was released at the end of the Inflationary Epoch, repopulating the
Universe with a dense, hot mixture of quarks, anti-quarks and gluons as it entered the Electroweak
Epoch.
Electroweak Epoch: $10^{-32}$ s ≤ t ≤ $10^{-12}$ s
The Electroweak Epoch was the period in the evolution of the early Universe when the temperature of the Universe was high enough to merge electromagnetism and the weak interaction into a
single electroweak interaction (≈ 100 GeV ≈ $10^{15}$ K). At approximately $10^{-32}$ s after the Big Bang
the potential energy of the inflaton field that had driven the inflation of the Universe during the
Inflationary Epoch was released, filling the Universe with a dense, hot quark-gluon plasma (reheating). Particle interactions in this phase were energetic enough to create large numbers of exotic
particles, including W and Z bosons and Higgs bosons. As the Universe expanded and cooled,
interactions became less energetic and when the Universe was about $10^{-12}$ s old, W and Z bosons
ceased to be created. The remaining W and Z bosons decayed quickly, and the weak interaction
became a short-range force in the following Quark Epoch.
After the Inflationary Epoch, the physics of the Electroweak Epoch is less speculative and better
understood than for previous periods of the early Universe. The existence of W and Z bosons has
been demonstrated, and other predictions of electroweak theory have been experimentally verified.
Quark Epoch: $10^{-12}$ s ≤ t ≤ $10^{-6}$ s
The Quark Epoch was the period in the evolution of the early Universe when the fundamental
interactions of gravitation, electromagnetism, the strong interaction and the weak interaction had
taken their present forms, but the temperature of the Universe was still too high to allow quarks
to bind together to form hadrons. The Quark Epoch began approximately $10^{-12}$ s after the Big
Bang, when the preceding Electroweak Epoch ended as the electroweak interaction separated into
the weak interaction and electromagnetism. During the Quark Epoch the Universe was filled with a
dense, hot quark-gluon plasma, containing quarks, gluons and leptons. Collisions between particles
were too energetic to allow quarks to combine into mesons or baryons. The Quark Epoch ended
when the Universe was about $10^{-6}$ s old, when the average energy of particle interactions had fallen
below the binding energy of hadrons. The following period, when quarks became confined within
hadrons, is known as the Hadron Epoch.
Hadron Epoch: $10^{-6}$ s ≤ t ≤ 1 s
The Hadron Epoch was the period in the evolution of the early Universe during which the
mass of the Universe was dominated by hadrons. It started approximately $10^{-6}$ s after the Big
Bang, when the temperature of the Universe had fallen sufficiently to allow the quarks from the
preceding Quark Epoch to bind together into hadrons. Initially, the temperature was high enough
to allow the creation of hadron/anti-hadron pairs, which kept matter and anti-matter in thermal
equilibrium. However, as the temperature of the Universe continued to fall, hadron/anti-hadron
pairs were no longer produced. Most of the hadrons and anti-hadrons were then eliminated in
annihilation reactions, leaving a small residue of hadrons. The elimination of anti-hadrons was
completed by one second after the Big Bang, when the following Lepton Epoch began.
Lepton Epoch: 1 s ≤ t ≤ 3 min
From the time $t_P$ of quantum gravity up to the lepton era, the physics of the Universe is
dominated by very high temperatures ($> 10^{12}$ K) and therefore by high-energy particle physics.
• Muon annihilation:
At sufficiently high temperatures, there is pair production:
$$ \gamma + \gamma \to \mu^+ + \mu^- \quad\Longrightarrow\quad \text{photon energy} \to \text{muon mass}. \tag{265} $$
This can persist only as long as $kT \ge 2m_\mu c^2$:
$$ T \ge \frac{2m_\mu c^2}{k} = \frac{2(200\,m_e)c^2}{k} = \frac{2(200 \times 9.1 \times 10^{-28})(3 \times 10^{10})^2}{1.38 \times 10^{-16}} \approx 2 \times 10^{12}\ \mathrm{K}. \tag{266} $$
Therefore, muons annihilate at $T \approx 10^{12}$ K.
• Electron/positron annihilation:
The argument used for muon annihilation applies to electron-positron pair production:
$$ T \ge \frac{2m_e c^2}{k} \approx 10^{10}\ \mathrm{K}, \tag{267} $$
so electrons and positrons annihilate at $T \approx 10^{10}$ K.
• Decoupling of electron neutrinos:
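Assuming the matter-dominated Universe, we crudely estimate the electron number density:
$$ \rho = \rho_0\left(\frac{a_0}{a}\right)^3 = \frac{\rho_0}{\left(2.3 \times 10^{-12}\, t^{2/3}\right)^3} \approx \frac{10^{-29}}{10^{-35}\, t^2} = 10^6\, t^{-2}, \qquad n_e = \frac{\rho}{m_e} \approx \frac{10^6\, t^{-2}}{9.1 \times 10^{-28}} \approx \frac{10^{33}}{t^2}. \tag{268} $$
The neutrino scattering cross-section is $\sigma_\nu \approx 10^{-44}$ cm², so the time between scatterings is
$$ t_\nu \approx \frac{1}{n_e \sigma_\nu c}. \tag{269} $$
Scatterings will become “scarce” when
$$ t_\nu \approx \frac{1}{\frac{10^{33}}{t_\nu^2}\,\sigma_\nu c} \quad\Longrightarrow\quad t_\nu = 10^{33}\,\sigma_\nu c = 10^{33}\left(10^{-44}\right)\left(3 \times 10^{10}\right) \approx 0.3\ \mathrm{s}. \tag{270} $$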
Therefore, electron neutrinos decouple from the Universe at about t ≈ 1 s.
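The order-of-magnitude estimates of the lepton epoch are easy to reproduce. The sketch below uses the rounded cgs values quoted in the text (including the crude present-day density $\rho_0 \approx 10^{-29}$ g/cm³ and the matter-dominated scale factor); all function and constant names are illustrative.

```python
K_B = 1.38e-16        # Boltzmann constant, erg / K
C = 3e10              # speed of light, cm / s
M_E = 9.1e-28         # electron mass, g
RHO_0 = 1e-29         # present-day density used in eq. (268), g / cm^3
A_COEFF = 2.3e-12     # a(t) = A_COEFF * t^(2/3), matter-dominated approximation
SIGMA_NU = 1e-44      # neutrino scattering cross-section, cm^2

def pair_threshold_temperature(mass_g):
    """Temperature (K) at which kT equals twice the rest energy, eqs. (266) and (267)."""
    return 2.0 * mass_g * C ** 2 / K_B

T_muon = pair_threshold_temperature(200.0 * M_E)   # muon mass taken as 200 electron masses
T_electron = pair_threshold_temperature(M_E)

def electron_density(t_seconds):
    """Crude electron number density (cm^-3) at age t, eq. (268)."""
    rho = RHO_0 / (A_COEFF * t_seconds ** (2.0 / 3.0)) ** 3
    return rho / M_E

# n_e = N / t^2, so setting t = 1 / (n_e * sigma * c) gives t = N * sigma * c, eq. (270)
N = electron_density(1.0)
t_decouple = N * SIGMA_NU * C

# Consistency check: the scattering time evaluated at t_decouple equals t_decouple
t_scatter = 1.0 / (electron_density(t_decouple) * SIGMA_NU * C)

print(f"muon threshold:     {T_muon:.1e} K")
print(f"electron threshold: {T_electron:.1e} K")
print(f"neutrino decoupling at t ~ {t_decouple:.2f} s")
```

Keeping the unrounded intermediate values changes the decoupling time only slightly relative to the 0.3 s obtained with rounded powers of ten.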
Table 6: Early history of the Big Bang Universe, up to the Big Bang Nucleosynthesis. Temperature estimates
are based on the crude matter-dominated Universe approximation: $T(t) \approx 10^{12}\, t^{-2/3}$ K.

| Epoch | Time | Temperature | Characteristics |
|---|---|---|---|
| Big Bang | 0 s | ∞ K; ∞ eV | singularity (vacuum fluctuation?) |
| Planck | 0 s < t ≤ $10^{-43}$ s | > $10^{40}$ K; > $10^{36}$ eV | quantum gravity |
| Grand Unification | $10^{-43}$ s ≤ t ≤ $10^{-36}$ s | $10^{36}$ to $10^{40}$ K; $10^{32}$ to $10^{36}$ eV | gravity freezes out; the “grand unified force” (GUT) |
| Inflationary | $10^{-36}$ s ≤ t ≤ $10^{-32}$ s | $10^{33}$ to $10^{36}$ K; $10^{29}$ to $10^{32}$ eV | inflation begins; strong force freezes out |
| Electroweak | $10^{-32}$ s ≤ t ≤ $10^{-12}$ s | $10^{20}$ to $10^{33}$ K; $10^{16}$ to $10^{29}$ eV | weak force freezes out; 4 distinct forces (EM dominates); baryogenesis: baryons and antibaryons annihilate |
| Quark | $10^{-12}$ s ≤ t ≤ $10^{-6}$ s | $10^{16}$ to $10^{20}$ K; $10^{12}$ to $10^{16}$ eV | Universe contains hot quark-gluon plasma: quarks, gluons and leptons |
| Hadron | $10^{-6}$ s ≤ t ≤ 1 s | $10^{12}$ to $10^{16}$ K; $10^{8}$ to $10^{12}$ eV | quarks and gluons bind into hadrons |
| Lepton | 1 s ≤ t ≤ 3 min | $10^{10}$ to $10^{12}$ K; $10^{6}$ to $10^{8}$ eV | Universe contains photons ($\gamma$), muons ($\mu^\pm$), electrons/positrons ($e^\pm$), and neutrinos ($\nu$, $\bar{\nu}$); nucleons n and p in equal numbers |
| | 1 s | ≤ $10^{12}$ K; ≤ $10^{8}$ eV | $\mu^+$ and $\mu^-$ annihilate; $\nu$ and $\bar{\nu}$ decouple; $e^\pm$, $\gamma$ and nucleons remain. Reactions: $e^+ + n \rightleftharpoons p + \bar{\nu}_e$, $e^- + p \rightleftharpoons n + \nu_e$, $n \to p + e^- + \bar{\nu}_e$ |
| | 100 s | $10^{10}$ K; $10^{6}$ eV | $e^+$ and $e^-$ annihilate |
Lecture 14: Early Universe
“True science teaches us to doubt and, in ignorance, to refrain.”
Claude Bernard
The Big Picture: Today we introduce the Boltzmann equation for annihilation as a tool for
studying the early Universe. We also begin to discuss the Big Bang Nucleosynthesis (BBN) during
which light elements formed.
The very early Universe was hot and dense, resulting in particle interactions occurring much
more frequently than today. For example, while a photon can today traverse the entire Universe
without interacting (deflection or capture), resulting in a mean-free path greater than $10^{28}$ cm, the
mean-free path of a photon when the Universe was 1 second old was about the size of an atom. This
resulted in a large number of interactions which kept the interacting constituents of the Universe
in equilibrium.
As the Universe expanded, the mean-free path of particles increased — thus decreasing the
rates of interactions — to the point where these could no longer maintain equilibrium conditions.
Different constituents of the Universe decoupled — fell out of equilibrium with the rest of the
Universe — at different times, which determined their abundance.
Falling out of equilibrium played a vital role in:
1. the formation of the light elements during Big Bang Nucleosynthesis (BBN);
2. recombination of electrons and protons into neutral hydrogen when the temperature was on
the order of 1/4 eV;
3. production of dark matter in the early Universe.
All three of these important phenomena are studied with the same formalism: the Boltzmann
equation.
Boltzmann Equation for Annihilation
The Boltzmann equation generalizes Friedmann’s second equation, which describes how the
abundance of a species of particles evolves with time:
$$ \dot{\rho} + 3\frac{\dot{a}}{a}(\rho + P) = 0, \qquad P(\rho) = 0 \quad \text{(dust approximation for matter)} $$
$$ \dot{\rho} + 3\frac{\dot{a}}{a}\rho = 0 \quad\Longrightarrow\quad a^{-3}\frac{d(\rho a^3)}{dt} = 0 \quad\Longrightarrow\quad a^{-3}\frac{d(n a^3)}{dt} = 0, \tag{271} $$
where $n$ is the abundance (number density) of a species. The equation above is valid for one species
in equilibrium, and does not account for creation and annihilation of particles.
The Boltzmann equation relates the rate of change in the abundance of a given particle to the
difference between the rates for producing and eliminating the species. It quantifies the abundance
of a species 1 ($n_1$) involved in a reaction with a species 2 to produce a pair of species, 3 and 4,
i.e., $1 + 2 \leftrightarrow 3 + 4$:
$$ \begin{aligned} a^{-3}\frac{d(n_1 a^3)}{dt} = {} & \int\frac{d^3p_1}{(2\pi)^3 2E_1}\int\frac{d^3p_2}{(2\pi)^3 2E_2}\int\frac{d^3p_3}{(2\pi)^3 2E_3}\int\frac{d^3p_4}{(2\pi)^3 2E_4} \\ & \times (2\pi)^4\,\delta^3(p_1 + p_2 - p_3 - p_4)\,\delta(E_1 + E_2 - E_3 - E_4)\,|\mathcal{M}|^2 \\ & \times \left\{f_3 f_4\,[1 \pm f_1][1 \pm f_2] - f_1 f_2\,[1 \pm f_3][1 \pm f_4]\right\}. \end{aligned} \tag{272} $$
In the absence of interactions, the right-hand side of the equation above vanishes, and the Boltzmann equation reduces to Friedmann’s second equation. From the equation above we see
that:
• the rate of production of species 1 is proportional to the abundance of species 3 and 4;
• the rate of loss of species 1 is proportional to the abundance of species 1 and 2;
• the likelihood of production of a particle is higher if it is a boson than if it is a fermion: in the factors $1 \pm f$, + stands for Bose
enhancement and − for Pauli blocking;
• the Dirac delta functions enforce energy and momentum conservation (energies are related to the
momenta by $E = \sqrt{p^2 + m^2}$);
• the $(2\pi)^4$ factor comes from replacing the discrete Kronecker delta with the continuous Dirac delta function;
• the amplitude $\mathcal{M}$ is determined from the physical processes taking place ($\propto \alpha$, the fine
structure constant for Compton scattering);
• to find the total number of interactions, we must integrate over all momenta;
• the factor $2E$ in the denominator arises because the phase-space integrals are four-dimensional
(4-momentum), with three components of spatial momenta and one of energy, and confined
to lie on a 3-sphere determined by $E^2 = p^2 + m^2$.
The Boltzmann equation for annihilation in the context of cosmological applications is aided
by several simplifications:
• Scattering processes typically enforce kinetic equilibrium — the scattering takes place so
rapidly that the distributions of various species have the generic BE or FD forms. The only
unknown then is µ, which now is a function of time. If the annihilations were to take place in
equilibrium, µ would be the chemical potential, and the left- and the right-hand side would
have to balance in a reaction: µ1 + µ2 = µ3 + µ4 . For out-of-equilibrium cases, the system is
not in chemical equilibrium, which yields a differential equation for µ.
• In the cosmological applications we considered here, the temperatures T are smaller than
the quantity E − µ, which makes the term exp [(E − µ)/T ] ≫ 1, so exp [(E − µ)/T ] ± 1 ≈
exp [(E − µ)/T ], yielding another simplification:
$$ f_{FD}(E) = f_{BE}(E) = f(E) = \frac{1}{e^{(E-\mu)/T}} = e^{\mu/T} e^{-E/T}. \tag{273} $$
This also means that exp [−(E − µ)/T ] ≈ f ≪ 1, so that 1 ± f1 ≈ 1. These approximations
cause the last line of the Boltzmann equation [eq. (272)] to simplify to
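$$ \begin{aligned} f_3 f_4\,[1 \pm f_1][1 \pm f_2] - f_1 f_2\,[1 \pm f_3][1 \pm f_4] & \approx f_3 f_4 - f_1 f_2 \\ & = e^{(\mu_3+\mu_4)/T} e^{-(E_3+E_4)/T} - e^{(\mu_1+\mu_2)/T} e^{-(E_1+E_2)/T} \\ & = e^{-(E_1+E_2)/T}\left[e^{(\mu_3+\mu_4)/T} - e^{(\mu_1+\mu_2)/T}\right]. \end{aligned} \tag{274} $$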
We have also used the conservation of energy here, $E_1 + E_2 = E_3 + E_4$. This now constitutes an
integrodifferential equation for $\mu_i$. It is, however, convenient to directly solve for the number
densities $n_i$ by relating the two via
$$ n_i \equiv g_i\int\frac{d^3p}{(2\pi)^3}\, f_i = g_i\, e^{\mu_i/T}\int\frac{d^3p}{(2\pi)^3}\, e^{-E_i/T}, \tag{275} $$
where $g_i$ is the degeneracy of the species.
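It is useful to define the equilibrium number density $n_i^{(0)}$:
$$ n_i^{(0)} \equiv g_i\int\frac{d^3p}{(2\pi)^3}\, e^{-E_i/T} = \begin{cases} g_i\left(\dfrac{m_i T}{2\pi}\right)^{3/2} e^{-m_i/T}, & m_i \gg T, \\[2ex] g_i\,\dfrac{T^3}{\pi^2}, & m_i \ll T, \end{cases} \tag{276} $$
so that
$$ e^{\mu_i/T} = \frac{n_i}{n_i^{(0)}}, \tag{277} $$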
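The two limits in eq. (276) can be checked by evaluating the momentum integral numerically. The sketch below works in natural units ($\hbar = c = k = 1$), uses $\int d^3p/(2\pi)^3 = \frac{1}{2\pi^2}\int p^2\,dp$ for an isotropic integrand, and applies a composite Simpson rule; the function name, the momentum cutoff and the sample values of $m/T$ are illustrative assumptions.

```python
import math

def equilibrium_density(m, T, g=1.0, steps=20000):
    """n^(0) = g/(2 pi^2) * integral of p^2 exp(-E/T) dp with E = sqrt(p^2 + m^2).

    The factor exp(-m/T) is pulled out of the integrand for numerical stability,
    and the integral is cut off where the integrand is negligible.
    """
    p_max = 40.0 * max(T, math.sqrt(m * T))
    h = p_max / steps
    def integrand(p):
        energy = math.sqrt(p * p + m * m)
        return p * p * math.exp(-(energy - m) / T)
    total = integrand(0.0) + integrand(p_max)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    integral = total * h / 3.0
    return g / (2.0 * math.pi ** 2) * math.exp(-m / T) * integral

T = 1.0

# Non-relativistic limit, m >> T
m_heavy = 100.0
numeric_heavy = equilibrium_density(m_heavy, T)
limit_heavy = (m_heavy * T / (2.0 * math.pi)) ** 1.5 * math.exp(-m_heavy / T)

# Relativistic limit, m << T
m_light = 0.01
numeric_light = equilibrium_density(m_light, T)
limit_light = T ** 3 / math.pi ** 2

print(f"m/T = 100:  numeric/limit = {numeric_heavy / limit_heavy:.4f}")
print(f"m/T = 0.01: numeric/limit = {numeric_light / limit_light:.4f}")
```

Both ratios should be close to 1; the heavy-particle case deviates at the percent level because the limiting form drops corrections of order $T/m$.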
so that the last line of the Boltzmann equation now becomes
$$e^{-(E_1+E_2)/T} \left[ e^{(\mu_3+\mu_4)/T} - e^{(\mu_1+\mu_2)/T} \right]
= e^{-(E_1+E_2)/T} \left[ \frac{n_3 n_4}{n_3^{(0)} n_4^{(0)}} - \frac{n_1 n_2}{n_1^{(0)} n_2^{(0)}} \right]. \tag{278}$$
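After defining the thermally averaged cross section as
$$\begin{aligned}
\langle \sigma v \rangle \equiv \frac{1}{n_1^{(0)} n_2^{(0)}}
&\int \frac{d^3p_1}{(2\pi)^3 2E_1} \int \frac{d^3p_2}{(2\pi)^3 2E_2} \int \frac{d^3p_3}{(2\pi)^3 2E_3} \int \frac{d^3p_4}{(2\pi)^3 2E_4}\, e^{-(E_1+E_2)/T} \\
&\times (2\pi)^4\, \delta^3(p_1 + p_2 - p_3 - p_4)\, \delta(E_1 + E_2 - E_3 - E_4)\, |\mathcal{M}|^2,
\end{aligned} \tag{279}$$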
the Boltzmann equation simplifies to
$$a^{-3} \frac{d}{dt}\left( n_1 a^3 \right) = n_1^{(0)} n_2^{(0)} \langle \sigma v \rangle \left[ \frac{n_3 n_4}{n_3^{(0)} n_4^{(0)}} - \frac{n_1 n_2}{n_1^{(0)} n_2^{(0)}} \right]. \tag{280}$$
This is a simple first order differential equation for the number density ni . Although some of the
details will be application-dependent (i.e., dependent on which particles are interacting), we will
use this to treat three different reactions:
1. neutron-proton ratio:
$$n + \nu_e \to p + e^-, \qquad n + e^+ \to p + \bar{\nu}_e, \tag{281}$$
2. recombination:
$$e + p \to H + \gamma, \tag{282}$$
3. dark matter production:
$$X + X \to l + l. \tag{283}$$
Saha equation. The left-hand side of the Boltzmann equation given in (280) is of the order of
$H n_1$ (since $a^{-3} \frac{d}{dt}(n_1 a^3) = \dot{n}_1 + 3 \frac{\dot{a}}{a} n_1 \propto H n_1$), while the right-hand side is of order $n_1 n_2 \langle \sigma v \rangle$.
Therefore, if the reaction rate is much larger than the expansion rate, $n_2 \langle \sigma v \rangle \gg H$, then the terms
on the right-hand side will be much larger than the terms on the left-hand side. In order for the
equality to be preserved, the terms in the brackets on the right-hand side should cancel each other
out (be extremely close to each other). This yields the Saha equation:
$$\frac{n_3 n_4}{n_3^{(0)} n_4^{(0)}} = \frac{n_1 n_2}{n_1^{(0)} n_2^{(0)}}. \tag{284}$$
Big Bang Nucleosynthesis (BBN)
As the temperature of the early Universe cools to 1 MeV, the cosmic plasma consists of:
• Relativistic particles in equilibrium: photons, electrons and positrons.
These interact among themselves via electromagnetic interaction e+ e− ↔ γγ. The abundances of these constituents are given by Fermi-Dirac and Bose-Einstein statistics.
• Decoupled relativistic particles: neutrinos.
At temperatures slightly above 1 MeV, the rate of interactions such as νe ↔ νe, which keep neutrinos
coupled to the rest of the plasma, drops below the rate of expansion of the Universe. Neutrinos
therefore still have the same temperature as the other relativistic particles, and hence are roughly
as abundant, but they no longer couple to them.
• Nonrelativistic particles: baryons.
If the number of baryons and antibaryons was completely symmetric, they would completely
annihilate away by 1 MeV. However, there was an initial asymmetry between baryons and
antibaryons
$$\frac{n_b - n_{\bar{b}}}{s} \approx 10^{-10}, \tag{285}$$
throughout the early history of the Universe, until the antibaryons were annihilated away at
about T ≈ 1 MeV. The resulting ratio between baryons and photons is given in terms of the
present-day baryon content of the Universe Ωb and the current Hubble rate h as
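$$\begin{aligned}
\eta_b \equiv \frac{n_b}{n_\gamma} = \frac{\rho_b / m_p}{n_\gamma} = \frac{\rho_{cr} \Omega_b / m_p}{n_\gamma}
&= \Omega_b\, \frac{1.87 h^2 \times 10^{-29}\ \mathrm{g\,cm^{-3}}}{(1.673 \times 10^{-24}\ \mathrm{g})(411\ \mathrm{cm^{-3}})} = 2.725 \times 10^{-8}\, \Omega_b h^2 \\
&= 5.45 \times 10^{-10} \left( \frac{\Omega_b h^2}{0.02} \right),
\end{aligned} \tag{286}$$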
where we have used $n_\gamma = 411\ \mathrm{cm^{-3}}$ (Homework set #2) and the critical density computed
on top of the page 21 of the notes: $\rho_{cr} = 1.87 h^2 \times 10^{-29}\ \mathrm{g\,cm^{-3}}$. Therefore, there are orders
of magnitude more relativistic particles than baryons at about T ≈ 1 MeV.
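The arithmetic in eq. (286) is easy to check numerically. The following is a minimal sketch that uses only the constants quoted in the text; the function and variable names are illustrative and not part of the notes.

```python
# Baryon-to-photon ratio from eq. (286), using the constants quoted in the text.
RHO_CR_OVER_H2 = 1.87e-29   # critical density divided by h^2, in g cm^-3
M_PROTON_G = 1.673e-24      # proton mass in g
N_GAMMA = 411.0             # present-day photon number density in cm^-3


def baryon_to_photon_ratio(omega_b_h2):
    """Return eta_b = n_b / n_gamma for a given value of Omega_b h^2."""
    n_b = RHO_CR_OVER_H2 * omega_b_h2 / M_PROTON_G  # baryons per cm^3 today
    return n_b / N_GAMMA


if __name__ == "__main__":
    eta = baryon_to_photon_ratio(0.02)
    print(f"eta_b(Omega_b h^2 = 0.02) = {eta:.3e}")
```

The result is of order 5 × 10⁻¹⁰, in line with eq. (286) up to rounding of the input constants, and it scales linearly with Ωb h².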
The goal of these next few lectures is to determine how the baryons arrange themselves. If the
equilibrium was maintained throughout the expansion, the final state of baryons would only be
dictated by energetics — all baryons would end up in iron, the element with the highest binding
energy. However, nuclear reactions are too slow to keep the Universe in equilibrium as its temperature drops. Therefore, the reactions do not lead up to iron, but stop at light elements when the
Universe becomes sparse enough to keep the further reactions from taking place.
In order to understand what happens to the baryons, we need to solve a set of coupled Boltzmann
differential equations [eq. (272)] for all reactions which are taking place. This indeed is a daunting
task, which is greatly ameliorated by two simplifications:
1. No elements heavier than helium are produced at appreciable levels (with the exception of
lithium at one part in $10^9 - 10^{10}$). Therefore, the only nuclei that need to be traced are
hydrogen (H) and helium (He), and their isotopes: deuterium ($^2$H or D), tritium ($^3$H), $^3$He and $^4$He.
2. The physics separates rather neatly into two parts since no light nuclei form above T ≈ 0.1
MeV: only free protons and neutrons exist. This means that we first have to solve for
neutron/proton abundance, and then use that result as input for the formation of nuclei
of light elements.
These simplifications rely on the physical fact that, at high temperatures comparable to binding
energies, whenever a nucleus is formed in a reaction, it is destroyed by a collision with a high-energy
photon. This can be quantified by the Saha equation [eq. (284)]. Let us consider binding of
a neutron and proton into a nucleus of deuterium:
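$$n + p \to D + \gamma. \tag{287}$$
Since photons have $n_\gamma = n_\gamma^{(0)}$, the Saha equation becomes
$$\begin{aligned}
\frac{n_3 n_4}{n_3^{(0)} n_4^{(0)}} = \frac{n_1 n_2}{n_1^{(0)} n_2^{(0)}}
\quad &\Longrightarrow \quad \frac{n_3 n_4}{n_1 n_2} = \frac{n_3^{(0)} n_4^{(0)}}{n_1^{(0)} n_2^{(0)}} \\
\Longrightarrow \quad \frac{n_D n_\gamma}{n_n n_p} = \frac{n_D^{(0)} n_\gamma^{(0)}}{n_n^{(0)} n_p^{(0)}}
\quad &\Longrightarrow \quad \frac{n_D}{n_n n_p} = \frac{n_D^{(0)}}{n_n^{(0)} n_p^{(0)}}.
\end{aligned} \tag{288}$$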
We are considering how this reaction takes place when the temperature of the Universe is on the
order of the binding energy of deuterium, which is $B_D = 2.22$ MeV. The masses of protons and
neutrons are $m_p = 938.27$ MeV and $m_n = 939.56$ MeV, and the mass of deuterium is $m_D =
m_p + m_n - B_D = 1875.61$ MeV, which means that we use the $m_i \gg T$ regime of eq. (276), to obtain
(note: $g_D = 3$ because of 3 spin states of D, and $g_p = 2$ and $g_n = 2$ because of their spin states):
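$$\begin{aligned}
\frac{n_D}{n_n n_p}
&= \frac{g_D \left( \frac{m_D T}{2\pi} \right)^{3/2} e^{-m_D/T}}{g_n \left( \frac{m_n T}{2\pi} \right)^{3/2} e^{-m_n/T}\; g_p \left( \frac{m_p T}{2\pi} \right)^{3/2} e^{-m_p/T}} \\
&= \frac{g_D}{g_n g_p} \left( \frac{T}{2\pi} \right)^{-3/2} \left( \frac{m_D}{m_n m_p} \right)^{3/2} e^{-(m_D - m_n - m_p)/T} \\
&= \frac{3}{4} \left( \frac{2\pi m_D}{m_n m_p T} \right)^{3/2} e^{B_D/T},
\end{aligned} \tag{289}$$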
because BD = mn + mp − mD . If we approximate mD ≈ 2mp and mn ≈ mp (which is valid to
within 0.15%), the equation above becomes
$$\frac{n_D}{n_n n_p} \approx \frac{3}{4} \left( \frac{4\pi}{m_p T} \right)^{3/2} e^{B_D/T}. \tag{290}$$
Because both neutron and proton density are proportional to the baryon density nb , the equation
above further simplifies into
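$$\begin{aligned}
\frac{n_D}{n_n n_p} \approx \frac{n_D}{n_b n_b} &\approx \frac{3}{4} \left( \frac{4\pi}{m_p T} \right)^{3/2} e^{B_D/T} \\
\Longrightarrow \quad \frac{n_D}{n_b} &\approx n_b\, \frac{3}{4} \left( \frac{4\pi}{m_p T} \right)^{3/2} e^{B_D/T}
\approx \eta_b n_\gamma\, \frac{3}{4} \left( \frac{4\pi}{m_p T} \right)^{3/2} e^{B_D/T} \\
&= \eta_b\, \frac{2 T^3}{\pi^2}\, \frac{3}{4} \left( \frac{4\pi}{m_p T} \right)^{3/2} e^{B_D/T}
= \frac{12}{\pi^{1/2}}\, \eta_b \left( \frac{T}{m_p} \right)^{3/2} e^{B_D/T} \\
\Longrightarrow \quad \frac{n_D}{n_b} &\approx 6.77\, \eta_b \left( \frac{T}{m_p} \right)^{3/2} e^{B_D/T} \\
\Longrightarrow \quad \frac{n_D}{n_b} &\sim \eta_b \left( \frac{T}{m_p} \right)^{3/2} e^{B_D/T}.
\end{aligned} \tag{291}$$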
As long as BD /T is not too large (and we are doing this analysis in the regime BD ∼ T ), the
prefactor dominates. Not only is mp ≫ T , and hence T /mp ≪ 1, but the baryon-to-photon ratio
ηb is extremely small [see eq. (286)], so the right-hand side of the equation above vanishes. This
means that the density of deuterium nuclei also vanishes.
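To see how strongly the small prefactor suppresses deuterium, eq. (291) can be evaluated at a few temperatures. This is a minimal sketch; the function name and the default value of ηb (taken from eq. (286)) are illustrative choices.

```python
import math

M_P = 938.27   # proton mass in MeV (value quoted in the text)
B_D = 2.22     # deuterium binding energy in MeV (value quoted in the text)


def deuterium_to_baryon_eq(T, eta_b=5.5e-10):
    """Equilibrium n_D / n_b from eq. (291), with T in MeV."""
    prefactor = 12.0 / math.sqrt(math.pi)  # evaluates to about 6.77
    return prefactor * eta_b * (T / M_P) ** 1.5 * math.exp(B_D / T)


if __name__ == "__main__":
    for T in (2.22, 1.0, 0.3, 0.1, 0.07):
        print(f"T = {T:5.2f} MeV   n_D/n_b ~ {deuterium_to_baryon_eq(T):.2e}")
```

At T of order B_D the ratio is many orders of magnitude below unity; it only approaches unity once T has dropped far below B_D. Values above one simply signal that the equilibrium estimate has stopped being meaningful.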
Small baryon-to-photon ratio thus inhibits nuclei production until the temperature drops well
beneath the nuclear binding energy (T ≪ BD ). This is why at temperatures T > 0.1 MeV virtually
all baryons are in the form of neutrons and protons. Around this temperature, the production of
deuterium and helium starts, but the reaction rates are too low to produce heavier elements. Not
having a stable isotope with mass number 5 means that heavier elements cannot be produced via
the reaction
$$^4\mathrm{He} + p \to X. \tag{292}$$
The heavier elements are formed in stars (triple alpha process):
$$^4\mathrm{He} + {}^4\mathrm{He} + {}^4\mathrm{He} \to {}^{12}\mathrm{C}, \tag{293}$$
but that is only much later. The early Universe is too sparse for these reactions to take place,
i.e. for three helium nuclei to find one another on relevant timescales.
Lecture 15: Big Bang Nucleosynthesis (BBN) continued
“Not only is the Universe stranger than we imagine, it is stranger than we can imagine.”
Sir Arthur Eddington
The Big Picture: Today we continue to discuss the Big Bang Nucleosynthesis.
The lack of stable nuclei with atomic weights of 5 or 8 limited the Big Bang to producing
hydrogen, helium and their isotopes. Burbidge, Burbidge, Fowler and Hoyle (1957, Reviews of
Modern Physics, 29, 547) worked out the nucleosynthesis processes that go on in stars, where the
much greater density and longer time scales allow the triple-alpha process (He+He+He→C) to
proceed and make the elements heavier than helium. But they could not produce enough helium.
Now we know that both processes occur: most helium is produced in the BBN but carbon and
everything heavier is produced in stars. Most lithium and beryllium is produced by cosmic ray
collisions breaking up some of the carbon produced in stars.
Figure 24: Nuclear binding energy curve.
Neutron Abundance
Let us now compute the neutron-proton ratio. Neutrons and protons can be converted into
each other via weak interaction:
p + e− ↔ n + νe
p + ν̄e ↔ n + e+
n ↔ p + e− + ν̄.
(294)
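When $m_p, m_n \gg T$ and the nucleons are in a non-relativistic regime ($E = m + p^2/2m$), we use
the appropriate portion of eq. (276):
$$\frac{n_p^{(0)}}{n_n^{(0)}} = \frac{g_p \left( \frac{m_p T}{2\pi} \right)^{3/2} e^{-m_p/T}}{g_n \left( \frac{m_n T}{2\pi} \right)^{3/2} e^{-m_n/T}}
= \left( \frac{m_p}{m_n} \right)^{3/2} e^{(m_n - m_p)/T} \approx e^{Q/T}, \tag{295}$$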
where we have used mp /mn ≈ 1 and defined Q ≡ mn −mp = 1.293 MeV. The equation above states
that at high temperatures T ≫ Q, there are as many neutrons as protons. As the temperature
drops beneath 1 MeV, the neutron fraction goes down. If these weak interactions were efficient
enough to maintain equilibrium, the proton-neutron ratio in eq. (295) would grow to infinity, which
means that the abundance of neutrons relative to protons would be negligible. However, this is not
the case (as is clear from the fact that we are here!).
To enable a clearer analysis of neutron-proton interaction, define a ratio of neutrons to total
nucleons:
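$$X_n \equiv \frac{n_n}{n_n + n_p}. \tag{296}$$
In equilibrium $n_p \to n_p^{(0)}$, $n_n \to n_n^{(0)}$, so
$$X_n \to X_{n,EQ} \equiv \frac{n_n^{(0)}}{n_n^{(0)} + n_p^{(0)}} = \frac{1}{1 + \left( n_p^{(0)} / n_n^{(0)} \right)}. \tag{297}$$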
Let us now track the evolution of $X_n$ in the weak reaction where neutron and proton convert
into each other and produce leptons [first two reactions in eq. (294)]. In terms of the Boltzmann
equation and the format of reaction 1 + 2 → 3 + 4, 1 = neutron, 3 = proton, 2,4 = leptons in
complete equilibrium ($n_l = n_l^{(0)}$). The Boltzmann equation [eq. (280)] then reads:
$$\begin{aligned}
a^{-3} \frac{d}{dt}\left( n_1 a^3 \right) &= n_1^{(0)} n_2^{(0)} \langle \sigma v \rangle \left[ \frac{n_3 n_4}{n_3^{(0)} n_4^{(0)}} - \frac{n_1 n_2}{n_1^{(0)} n_2^{(0)}} \right] \\
a^{-3} \frac{d}{dt}\left( n_n a^3 \right) &= n_n^{(0)} n_l^{(0)} \langle \sigma v \rangle \left[ \frac{n_p n_l}{n_p^{(0)} n_l^{(0)}} - \frac{n_n n_l}{n_n^{(0)} n_l^{(0)}} \right]
= n_l^{(0)} \langle \sigma v \rangle \left[ \frac{n_p n_n^{(0)}}{n_p^{(0)}} - n_n \right].
\end{aligned} \tag{298}$$
From eq. (295), we have that $n_n^{(0)} / n_p^{(0)} = e^{-Q/T}$. Also, we define
$$\lambda_{np} \equiv n_l^{(0)} \langle \sigma v \rangle, \tag{299}$$
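as the rate of neutron-proton conversion, because it multiplies $n_n$ in the loss term. If we write
$n_n = X_n (n_n + n_p)$, then we can rewrite the left-hand side of the eq. (298) as
$$\begin{aligned}
a^{-3} \frac{d}{dt}\left( n_n a^3 \right) = a^{-3} \frac{d}{dt}\left[ X_n (n_n + n_p) a^3 \right]
&= a^{-3} \left[ \frac{dX_n}{dt} (n_n + n_p) a^3 + X_n \frac{d}{dt}\left( (n_n + n_p) a^3 \right) \right] \\
&= \frac{dX_n}{dt} (n_n + n_p),
\end{aligned} \tag{300}$$
since, as we derived earlier, $\rho_b a^3 = \mathrm{const.}$, so $n_b a^3 = (n_n + n_p) a^3 = \mathrm{const.}$, and $\frac{d}{dt}\left[ (n_n + n_p) a^3 \right] = 0$.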
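The right-hand side of the eq. (298) is simplified after expressing
$$n_n = \frac{X_n}{1 - X_n}\, n_p, \tag{301}$$
to yield
$$\begin{aligned}
\frac{dX_n}{dt} (n_n + n_p) &= \lambda_{np} \left[ n_p e^{-Q/T} - \frac{X_n}{1 - X_n}\, n_p \right]
= \frac{n_p}{1 - X_n}\, \lambda_{np} \left\{ (1 - X_n) e^{-Q/T} - X_n \right\} \\
&= \frac{n_n}{X_n}\, \lambda_{np} \left\{ (1 - X_n) e^{-Q/T} - X_n \right\}
= (n_n + n_p)\, \lambda_{np} \left\{ (1 - X_n) e^{-Q/T} - X_n \right\} \\
\Longrightarrow \quad \frac{dX_n}{dt} &= \lambda_{np} \left\{ (1 - X_n) e^{-Q/T} - X_n \right\}.
\end{aligned} \tag{302}$$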
The right-hand side of the equation above depends on the temperature T and the reaction rate $\lambda_{np}$,
which both depend on time. We further "massage" the Boltzmann equation for the neutron-nucleon ratio
above by introducing the evolution variable
$$x \equiv \frac{Q}{T}, \tag{303}$$
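so the left-hand side of the eq. (302) becomes
$$\frac{dX_n}{dt} = \frac{dX_n}{dx} \frac{dx}{dt} = \frac{dX_n}{dx} \left[ -\frac{Q \dot{T}}{T^2} \right] = \frac{dX_n}{dx} \left[ -x \frac{\dot{T}}{T} \right]. \tag{304}$$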
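But, since $T \propto a^{-1}$ [see eq. (259)], or $T = k a^{-1}$,
$$\frac{\dot{T}}{T} = \frac{k \frac{d}{dt} a^{-1}}{k a^{-1}} = \frac{-\dot{a} a^{-2}}{a^{-1}} = -\dot{a} a^{-1} = -\frac{\dot{a}}{a} \equiv -H = -\sqrt{\frac{8\pi G \rho}{3}}. \tag{305}$$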
Now we need to estimate the energy density of the Universe ρ. The BBN takes place during the
radiation-dominated era, so the energy density of the Universe ρ will be determined by the relativistic
particles. We saw earlier [eq. (205) where g = 2] that the contribution from the relativistic
particles is
$$\rho = g_\star \frac{\pi^2}{30} T^4 \equiv \frac{\pi^2}{30} T^4 \left[ \sum_{i = \mathrm{bosons}} g_i + \frac{7}{8} \sum_{i = \mathrm{fermions}} g_i \right], \tag{306}$$
where $g_\star$ is an effective number of relativistic degrees of freedom, and where a factor 7/8 comes from
a + in the denominator for FD distribution. $g_\star$ is a function of temperature, because reactions
constantly reshuffle relative abundances of fermions and bosons. At the time of the BBN, the
temperature is on the order of 1 MeV, at which time the contributing relativistic particles were:
• photons: gγ = 2 (2 spin states);
• neutrinos: gν = 6 (3 flavors, each with a neutrino and an antineutrino);
• electrons: ge− = 2 (2 spin states);
• positrons: ge+ = 2 (2 spin states);
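with the total being
$$g_\star = 2 + \frac{7}{8}(6 + 2 + 2) = 2 + \frac{70}{8} = 2 + 8.75 = 10.75. \tag{307}$$
We also have that the Hubble rate H(x) can be expressed in terms of H(x = 1):
$$\begin{aligned}
H(x) &= \sqrt{\frac{8\pi G \rho}{3}}, \qquad \rho = g_\star \frac{\pi^2}{30} T^4, \\
&= \sqrt{\frac{8\pi G g_\star \pi^2 T^4}{90}} = \sqrt{\frac{8\pi G g_\star \pi^2 Q^4}{90 x^4}} = x^{-2} \sqrt{\frac{4\pi^3 G Q^4}{45}} \sqrt{10.75} \\
&\equiv x^{-2} H(x = 1).
\end{aligned} \tag{308}$$
We compute the Hubble rate at x = 1 to be
$$H(x = 1) = \sqrt{\frac{4\pi^3 G Q^4}{45}} \sqrt{10.75} = 1.13\ \mathrm{s^{-1}}. \tag{309}$$
After substituting eqs. (304)-(305) and (308)-(309) into eq. (302), we obtain
$$\begin{aligned}
\frac{dX_n}{dt} = \frac{dX_n}{dx}\, x H &= \lambda_{np} \left\{ (1 - X_n) e^{-x} - X_n \right\} \\
\frac{dX_n}{dx} = \frac{\lambda_{np}}{x H} \left\{ (1 - X_n) e^{-x} - X_n \right\} &= \frac{\lambda_{np}}{x H(x = 1) x^{-2}} \left\{ (1 - X_n) e^{-x} - X_n \right\} \\
\Longrightarrow \quad \frac{dX_n}{dx} &= \frac{x \lambda_{np}}{H(x = 1)} \left\{ (1 - X_n) e^{-x} - X_n \right\}.
\end{aligned} \tag{310}$$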
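The value H(x = 1) = 1.13 s⁻¹ in eq. (309) can be reproduced with a short calculation in natural units, where G = 1/M_Pl². The sketch below assumes standard values for the Planck mass and ħ, which are not quoted in these notes; the function names are illustrative.

```python
import math

# Assumed constants (not quoted in the notes): Planck mass and hbar.
M_PLANCK = 1.2209e22   # MeV; in natural units G = 1 / M_PLANCK**2
HBAR = 6.5821e-22      # MeV s; converts an energy in MeV into a rate in s^-1
Q = 1.293              # neutron-proton mass difference in MeV (from the text)


def g_star(boson_dof, fermion_dof):
    """Effective relativistic degrees of freedom, eq. (306), at a common temperature."""
    return boson_dof + 7.0 / 8.0 * fermion_dof


def hubble_at_x1(g):
    """H(x=1) = sqrt(4 pi^3 G Q^4 g / 45), eq. (309), returned in s^-1."""
    rate_mev = math.sqrt(4.0 * math.pi ** 3 * g / 45.0) * Q ** 2 / M_PLANCK
    return rate_mev / HBAR


if __name__ == "__main__":
    g = g_star(2.0, 6.0 + 2.0 + 2.0)   # photons; neutrinos, electrons, positrons
    print(f"g_star = {g}, H(x=1) = {hubble_at_x1(g):.3f} s^-1")
```

With g⋆ = 10.75 this comes out close to the 1.13 s⁻¹ quoted in eq. (309).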
The rate of neutron-proton conversion $\lambda_{np}$ is defined as
$$\lambda_{np} = n_{\nu_e}^{(0)} \langle \sigma v \rangle, \tag{311}$$
and can be computed from eq. (279) (Extra credit on Homework set #2: problem 3b; note a
typo in definition of $\tau_n$ in the textbook: it should be $\tau_n^{-1}$) to yield:
$$\lambda_{np} = \frac{255}{\tau_n x^5} \left( x^2 + 6x + 12 \right), \tag{312}$$
where $\tau_n = 886.7$ s is the neutron mean lifetime. From the equation above, we see that when
T ≈ Q, i.e., when x ≈ 1, the conversion rate is 5.5 s⁻¹, which is somewhat larger than the
expansion rate H(x = 1) = 1.13 s⁻¹. As the temperature drops below 1 MeV, the rate rapidly falls
below the expansion rate, so conversions become rare. One can compute the temperature at which
the expansion rate H and the neutron-proton conversion rate $\lambda_{np}$ are equal:
$$\begin{aligned}
H(x) &= \lambda_{np}(x) \\
H(x = 1)\, x^{-2} &= \frac{255}{\tau_n x^5} \left( x^2 + 6x + 12 \right) \\
H(x = 1) &= \frac{255}{\tau_n x^3} \left( x^2 + 6x + 12 \right) \\
\Longrightarrow \quad x = 1.9 \quad \Longrightarrow \quad T &= Q/x = 1.293/1.9\ \mathrm{MeV} = 0.68\ \mathrm{MeV}.
\end{aligned} \tag{313}$$
Note that for T ≈ 1 MeV, this rate of neutron-proton conversion is about three orders of magnitude
larger than the free neutron decay rate $\tau_n^{-1} = 1.1 \times 10^{-3}\ \mathrm{s^{-1}}$.
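The crossing point in eq. (313) has no closed form, but it is straightforward to find numerically. The sketch below uses a plain bisection; the helper names and the search interval are illustrative choices, while the constants are those quoted in the text.

```python
TAU_N = 886.7   # neutron mean lifetime in s
H1 = 1.13       # H(x=1) in s^-1, eq. (309)
Q = 1.293       # neutron-proton mass difference in MeV


def lambda_np(x):
    """Neutron-proton conversion rate of eq. (312), in s^-1."""
    return 255.0 / (TAU_N * x ** 5) * (x * x + 6.0 * x + 12.0)


def hubble(x):
    """Expansion rate H(x) = H(x=1) / x^2 of eq. (308), in s^-1."""
    return H1 / x ** 2


def bisect(f, lo, hi, tol=1e-10):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    x_cross = bisect(lambda x: lambda_np(x) - hubble(x), 0.5, 10.0)
    print(f"lambda_np(x=1) = {lambda_np(1.0):.2f} s^-1")
    print(f"rates equal at x = {x_cross:.2f}, T = {Q / x_cross:.2f} MeV")
```

The same `bisect` helper could be reused for any other rate comparison of this kind, since both rates are monotonic in x.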
The approximations incorporated into this derivation of Xn are:
• Boltzmann approximation to BE and FD statistics;
• vanishing me (in computing λnp );
• constant g⋆ throughout.
Computation of Xn can be and has been done without these approximations and the resulting
curves are shown in Fig. 25. The results obtained by numerical integration of the eq. (310) are
also plotted in the same figure. The approximate solution of eq. (310) agrees extremely well with the
solution obtained without the above assumptions for temperatures T > 0.1 MeV. For temperatures
below that, the vanishing electron mass is no longer a good approximation (me = 0.5 MeV > T), and
the results become increasingly inaccurate. The solution of the eq. (310) falls out of equilibrium at
about T ≈ 1 MeV and "freezes out" at about 0.15 once the temperature falls below 0.5 MeV.
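The numerical integration of eq. (310) mentioned above can be sketched with a fixed-step fourth-order Runge-Kutta scheme. This is an illustration under stated assumptions, not the code used for Fig. 25: the integration starts at x = 0.5 from the equilibrium value of eq. (297), which is adequate because the conversion rate there is still much larger than the expansion rate; the step count and end point are arbitrary choices.

```python
import math

TAU_N = 886.7   # neutron mean lifetime in s
H1 = 1.13       # H(x=1) in s^-1, eq. (309)


def lambda_np(x):
    """Neutron-proton conversion rate of eq. (312), in s^-1."""
    return 255.0 / (TAU_N * x ** 5) * (x * x + 6.0 * x + 12.0)


def xn_equilibrium(x):
    """Equilibrium neutron fraction, eq. (297) with n_p/n_n = exp(x)."""
    return 1.0 / (1.0 + math.exp(x))


def dxn_dx(x, xn):
    """Right-hand side of eq. (310)."""
    return x * lambda_np(x) / H1 * ((1.0 - xn) * math.exp(-x) - xn)


def integrate_xn(x_start=0.5, x_end=20.0, n_steps=40000):
    """Integrate eq. (310) with classical RK4; returns X_n at x_end."""
    h = (x_end - x_start) / n_steps
    x, xn = x_start, xn_equilibrium(x_start)
    for _ in range(n_steps):
        k1 = dxn_dx(x, xn)
        k2 = dxn_dx(x + 0.5 * h, xn + 0.5 * h * k1)
        k3 = dxn_dx(x + 0.5 * h, xn + 0.5 * h * k2)
        k4 = dxn_dx(x + h, xn + h * k3)
        xn += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        x += h
    return xn


if __name__ == "__main__":
    print(f"X_n at large x (freeze-out value): {integrate_xn():.3f}")
```

The equation is stiff at small x, which is why the sketch does not start closer to x = 0; an implicit or adaptive scheme would be the more robust choice for a production calculation.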
Figure 25: Evolution of light element abundances in the early Universe. Heavy solid lines are results from
Wagoner (1973, Astrophysical Journal, 179, 343) code; dashed curve is from integration of eq. (310); light
solid curve is twice the neutron equilibrium abundance. There is a good agreement of eq. (310) and the
exact result until the onset of neutron decay. Neutron abundance falls out of equilibrium at T ∼ 1 MeV.
At T < 0.1 MeV, two additional reactions become important and affect the neutron abundance:
• neutron decay: n → p + e− + ν̄;
• deuterium production: n + p → D + γ.
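Neutron decay can be accounted for easily by multiplying the results of the eq. (310) by a factor
$e^{-t/\tau_n}$. Decays become as important as the neutron-proton conversion considered in the eq. (310)
when the rates become equal:
$$\begin{aligned}
\lambda_{np} = 1/\tau_n \quad &\Longrightarrow \quad 1 = \frac{255}{x^5} \left( x^2 + 6x + 12 \right) \\
\Longrightarrow \quad x = 7.92 \quad &\Longrightarrow \quad T = \frac{Q}{x} = \frac{1.293}{7.92}\ \mathrm{MeV} = 0.16\ \mathrm{MeV}.
\end{aligned} \tag{314}$$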
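By the time this happens, electrons and positrons have annihilated, so the effective number of
relativistic degrees of freedom $g_\star$ in eq. (306) is found by
$$\begin{aligned}
\rho = \tilde{g}_\star \frac{\pi^2}{30} T_\gamma^4
&\equiv \frac{\pi^2}{30} \left[ T_\gamma^4 \sum_{i = \mathrm{bosons}} g_i + T_\nu^4\, \frac{7}{8} \sum_{i = \mathrm{fermions}} g_i \right]
= \frac{\pi^2}{30} T_\gamma^4 \left[ \sum_{i = \mathrm{bosons}} g_i + \left( \frac{T_\nu}{T_\gamma} \right)^4 \frac{7}{8} \sum_{i = \mathrm{fermions}} g_i \right] \\
\rho &= \frac{\pi^2}{30} T_\gamma^4 \left[ 2 + \left( \frac{T_\nu}{T_\gamma} \right)^4 \frac{7}{8}\, 6 \right]
= \frac{\pi^2}{30} T_\gamma^4 \left[ 2 + \left( \frac{4}{11} \right)^{4/3} \frac{21}{4} \right]
= \frac{\pi^2}{30} T_\gamma^4\, [3.36] \\
&\Longrightarrow \quad \tilde{g}_\star = 3.36,
\end{aligned} \tag{315}$$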
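where we have used the result from eq. (220): $T_\nu / T_\gamma = (4/11)^{1/3}$. The time-temperature relation
is found after recognizing, again, that
$$\begin{aligned}
H(x) = \sqrt{\frac{8\pi G \rho}{3}} = \sqrt{\frac{8\pi G \tilde{g}_\star \pi^2 Q^4}{90 x^4}} &= x^{-2} \sqrt{\frac{4\pi^3 G Q^4}{45}} \sqrt{3.36} \\
&\equiv x^{-2} \tilde{H}(x = 1),
\end{aligned} \tag{316}$$
where
$$\tilde{H}(x = 1) = \sqrt{\frac{4\pi^3 G Q^4}{45}} \sqrt{3.36} = 0.632\ \mathrm{s^{-1}} \quad \text{or} \quad \tilde{H}(x = 1) = H(x = 1) \frac{\sqrt{3.36}}{\sqrt{10.75}}, \tag{317}$$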
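and
$$\begin{aligned}
H(x) = x^{-2} \tilde{H}(x = 1) = -\frac{\dot{T}}{T}
\quad &\Longrightarrow \quad \frac{T^2}{Q^2} \tilde{H}(x = 1) = -\frac{\dot{T}}{T} \\
\Longrightarrow \quad \frac{\tilde{H}(x = 1)}{Q^2} = -\frac{\dot{T}}{T^3}
\quad &\Longrightarrow \quad \frac{\tilde{H}(x = 1)}{Q^2} \int dt = -\int \frac{dT}{T^3} \\
\Longrightarrow \quad \frac{\tilde{H}(x = 1)}{Q^2}\, t = \frac{1}{2} T^{-2}
\quad &\Longrightarrow \quad t = \frac{Q^2}{2 \tilde{H}(x = 1)}\, T^{-2} \\
\Longrightarrow \quad t = \frac{10^2 Q^2}{2 \tilde{H}(x = 1)} \left( \frac{0.1\ \mathrm{MeV}}{T} \right)^2
&= \frac{10^2 (1.293)^2}{2 (0.632)}\ \mathrm{s} \left( \frac{0.1\ \mathrm{MeV}}{T} \right)^2 \\
\Longrightarrow \quad t &= 132\ \mathrm{s} \left( \frac{0.1\ \mathrm{MeV}}{T} \right)^2.
\end{aligned} \tag{318}$$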
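The numbers entering eqs. (315)-(318) can be verified in a few lines. The sketch below takes H(x = 1) = 1.13 s⁻¹ and Q = 1.293 MeV from the text; the function names are illustrative.

```python
import math

Q = 1.293    # neutron-proton mass difference in MeV
H1 = 1.13    # H(x=1) before e+ e- annihilation, in s^-1 (g_star = 10.75)


def g_star_after_annihilation():
    """g_star of eq. (315): photons plus neutrinos at T_nu/T_gamma = (4/11)^(1/3)."""
    temperature_ratio = (4.0 / 11.0) ** (1.0 / 3.0)
    return 2.0 + 7.0 / 8.0 * 6.0 * temperature_ratio ** 4


def h_tilde():
    """Rescaled Hubble rate at x=1 after annihilation, eq. (317), in s^-1."""
    return H1 * math.sqrt(g_star_after_annihilation() / 10.75)


def time_coefficient():
    """Coefficient of (0.1 MeV / T)^2 in eq. (318), in seconds."""
    return 100.0 * Q ** 2 / (2.0 * h_tilde())


if __name__ == "__main__":
    print(f"g_star after annihilation = {g_star_after_annihilation():.2f}")
    print(f"H~(x=1) = {h_tilde():.3f} s^-1")
    print(f"t = {time_coefficient():.0f} s * (0.1 MeV / T)^2")
```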
The BBN (the start of production of deuterium and other light elements) starts around T ≈ 0.07
MeV (as we will see shortly), by which time the decays have depleted the neutron fraction by a factor
$$\exp\left[ -t/\tau_n \right] = \exp\left[ -\frac{132\ \mathrm{s}}{886.7\ \mathrm{s}} \left( \frac{0.1\ \mathrm{MeV}}{0.07\ \mathrm{MeV}} \right)^2 \right] = \exp\left[ -(132/886.7)(0.1/0.07)^2 \right] = 0.74. \tag{319}$$
So, the neutron abundance at the start of the BBN is 0.15 × 0.74 = 0.11, or
$$X_n(T_{nuc}) = 0.11. \tag{320}$$
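As a quick check of eqs. (319)-(320), the sketch below combines the time-temperature relation of eq. (318) with the exponential decay law. The freeze-out value 0.15 and the temperature 0.07 MeV are taken from the text; the function names are illustrative.

```python
import math

TAU_N = 886.7   # neutron mean lifetime in s


def elapsed_time(T):
    """Age of the Universe in seconds at temperature T (MeV), eq. (318)."""
    return 132.0 * (0.1 / T) ** 2


def surviving_fraction(T):
    """Fraction of neutrons that have not yet decayed at temperature T (MeV)."""
    return math.exp(-elapsed_time(T) / TAU_N)


if __name__ == "__main__":
    T_nuc = 0.07          # MeV, onset of nucleosynthesis
    xn_freeze_out = 0.15  # freeze-out neutron fraction from eq. (310)
    factor = surviving_fraction(T_nuc)
    print(f"t(T_nuc) = {elapsed_time(T_nuc):.0f} s, decay factor = {factor:.2f}")
    print(f"X_n(T_nuc) = {xn_freeze_out * factor:.2f}")
```

Note that the temperature ratio enters squared, so the exponent is about 0.30 rather than 0.21.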
Lecture 16: Light Element Abundances and Recombination
“We are just an advanced breed of monkeys on a minor planet of a very average star. But we can
understand the Universe. That makes us something very special.”
Stephen Hawking
The Big Picture: Today we continue our exposition of the Big Bang Nucleosynthesis, by discussing the abundances of light elements. We also discuss the recombination epoch of the Universe,
when the first atoms began to form, and the Universe became transparent.
Review of Processes Leading Up to the Big Bang Nucleosynthesis
In order to understand which processes were taking place early in the Universe, we need to
compute the reaction rates and compare them to the rate of the expansion of the Universe H.
Earlier, we have found that the expansion rate H and the neutron-proton conversion rates due to
weak reactions λnp become equal at T ≈ 0.68 MeV [eq. (313)], and that the neutron decay rate τn−1
becomes equal to λnp at T ≈ 0.16 MeV [eq. (314)]. The remaining equality between the neutron
decay τn−1 and the expansion H is easily found by solving:
$$\begin{aligned}
\tau_n^{-1} = H(x) = \tilde{H}(x = 1)\, x^{-2}
\quad &\Longrightarrow \quad x = \sqrt{\tilde{H}(x = 1)\, \tau_n} = \sqrt{(0.632\ \mathrm{s^{-1}})(886.7\ \mathrm{s})} = 23.67 \\
&\Longrightarrow \quad T = Q/x = 1.293/23.67\ \mathrm{MeV} \\
&\Longrightarrow \quad T = 0.055\ \mathrm{MeV}.
\end{aligned} \tag{321}$$

| Temp. | Description | Reactions |
|---|---|---|
| > 1 MeV | weak reactions on the right maintain the neutron-nucleon ratio in thermal equilibrium | p + e⁻ ↔ n + νe; p + ν̄ ↔ n + e⁺ |
| ≈ 0.68 MeV | weak reaction rates λnp become slower than expansion H; neutron-nucleon ratio eventually "freezes out" at ≈ 0.15 | |
| ≈ 0.16 MeV | neutron decay rate τn⁻¹ is equal to weak reactions rate λnp | |
| ≈ 0.055 MeV | neutron decay rate τn⁻¹ is equal to the expansion H | |
| < 0.1 MeV | the only reaction that appreciably changes the number of neutrons is neutron decay (τn = 886.7 s) | n → p + e⁻ + ν̄ |
| ≈ 0.07 MeV | deuterium nuclei production begins (BBN starts) | p + n → D + γ |
| ≈ 0.07 MeV | helium nuclei production begins (with photon emission); these reactions are slower because of the abundance of photons | D + n → ³H + γ; ³H + p → ⁴He + γ; D + p → ³He + γ; ³He + n → ⁴He + γ |
| ≈ 0.07 MeV | helium nuclei production begins (without photon emission) | D + D → ³He + n; D + D → ³H + p; ³H + D → ⁴He + n; ³He + D → ⁴He + p |
| < 0.05 MeV | helium nuclei production finishes (electrostatic repulsion of nuclei of D causes it to stop); most neutrons in the Universe end up in ⁴He nuclei | D + D → ⁴He + γ |
| < 0.01 MeV | deuterium nuclei abundance "freezes out" at ≈ 10⁻⁴ − 10⁻⁵ | |
Figure 26: Rates of reaction between protons and neutrons in the early Universe, compared to the relative
abundance of elements. λnp is the rate of reactions p + l ↔ n + l; τn⁻¹ is the rate of neutron decay; and H
is the expansion of the Universe (top line is before and bottom after e⁻/e⁺ annihilation).
Light Element Abundances
Nuclei of light elements are produced as the temperature of the Universe drops below $T = T_{nuc}$.
The first to be produced are the nuclei of deuterium, via the reaction p + n → D + γ. If the
Universe stayed in equilibrium, all neutrons and protons would form deuterium, which means that
the equilibrium deuterium abundance is on the order of baryon abundance. From the eq. (291), we
can see that the equilibrium deuterium-baryon ratio is of order unity when:
$$\begin{aligned}
\frac{n_D}{n_b} &\approx 6.77\, \eta_b \left( \frac{T_{nuc}}{m_p} \right)^{3/2} e^{B_D/T_{nuc}} = 1 \\
\Longrightarrow \quad \ln(6.77\, \eta_b) + \frac{3}{2} \ln\left( \frac{T_{nuc}}{m_p} \right) &\approx -\frac{B_D}{T_{nuc}}
\quad \Longrightarrow \quad T_{nuc} \approx 0.07\ \mathrm{MeV}.
\end{aligned} \tag{322}$$
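Eq. (322) is transcendental in T_nuc, so it has to be solved numerically. A minimal bisection sketch follows; the default ηb, the bracketing interval and the function names are illustrative assumptions, while m_p and B_D are the values quoted earlier.

```python
import math

M_P = 938.27   # proton mass in MeV
B_D = 2.22     # deuterium binding energy in MeV


def log_deuterium_fraction(T, eta_b):
    """ln(n_D / n_b) from eq. (291), with T in MeV."""
    return math.log(6.77 * eta_b) + 1.5 * math.log(T / M_P) + B_D / T


def nucleosynthesis_temperature(eta_b=5.5e-10, lo=0.01, hi=1.0):
    """Solve eq. (322) for T_nuc in MeV by bisection.

    On [lo, hi] the log-fraction decreases with T: it is positive at lo
    and negative at hi, so the root is bracketed.
    """
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if log_deuterium_fraction(mid, eta_b) > 0.0:
            lo = mid   # deuterium still favored: root is at higher T
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(f"T_nuc = {nucleosynthesis_temperature():.3f} MeV")
```

Because ηb enters only through a logarithm, T_nuc changes very little even if ηb is varied by an order of magnitude.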
The binding energy of helium is larger than that of deuterium, which is why the factor $e^{B/T}$ in eq. (291)
favors production of helium over deuterium. As can be seen from Fig. 26, production of helium
starts almost immediately after deuterium starts forming. According to Fig. 26, virtually all
neutrons at $T \approx T_{nuc}$ are turned into nuclei of ⁴He. There are two neutrons and two protons in a
nucleus of ⁴He, which means that the final number abundance of ⁴He is equal to about a half of the neutron
abundance at the onset of nucleosynthesis ($T = T_{nuc}$). Defining the mass fraction of ⁴He, we obtain
$$X_4 \equiv \frac{4 n_{^4\mathrm{He}}}{n_b} = 2 X_n(T_{nuc}) = 0.22, \tag{323}$$
where we have used eq. (320): $X_n(T_{nuc}) = 0.11$. This approximates the exact solution well:
$$Y_p = 0.2262 + 0.0135 \ln(\eta_b / 10^{-10}). \tag{324}$$
One important feature of this exact result is that the helium mass fraction depends
only logarithmically on the baryon-to-photon ratio ηb. This means that the abundance of helium
will not be a good probe in determining the baryon energy density Ωb. The value of the abundance
of 4 He hinges on the presence of a hot radiation field which prevents the formation of deuterium
before T = 0.1 MeV. Therefore, the fact that presently most of the matter is in the form of
hydrogen, i.e., not all the matter has transformed into 4 He, is a strong argument for the existence
of a primeval cosmic background radiation.
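The weak sensitivity of helium to the baryon density is easy to see by evaluating eq. (324) over a range of ηb. This is a minimal sketch; the function name and the sampled values of ηb are illustrative.

```python
import math


def helium_mass_fraction(eta_b):
    """Primordial helium mass fraction Y_p from the fit in eq. (324)."""
    return 0.2262 + 0.0135 * math.log(eta_b / 1e-10)


if __name__ == "__main__":
    for eta_b in (1e-10, 3e-10, 5.5e-10, 1e-9):
        print(f"eta_b = {eta_b:.1e}   Y_p = {helium_mass_fraction(eta_b):.4f}")
```

A tenfold change in ηb shifts Y_p by only about 0.03, which is why deuterium rather than helium is used to pin down Ωb h².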
Figure 26 shows that a portion of the deuterium remains unprocessed into helium, because the
reaction which does this, D + p → ³He + γ, is not entirely efficient. It shows that the depletion
of deuterium eventually "freezes out" at a level of order $10^{-5} - 10^{-4}$. The rate of this reaction
depends on the baryon density: if there are plenty of baryons to interact, the reactions will proceed
effectively; if the density of baryons is low, the depletion of deuterium will not be as effective.
Therefore, abundance of deuterium is a powerful probe of the baryon density, as can be seen from
Fig. 27. The measurements of primordial deuterium abundance show that the ratio of deuterium
to hydrogen is $\mathrm{D/H} = (3.0 \pm 0.4) \times 10^{-5}$, which corresponds to $\Omega_b h^2 = 0.0205 \pm 0.0018$.
BBN Summarized
The BBN lasted for only a few minutes (during the period when the Universe was from 3 to
about 20 minutes old). After that, the temperature and density of the Universe fell below that
which is required for nuclear fusion. The brevity of BBN is important because it prevented elements
heavier than beryllium from forming while at the same time allowing unburned light elements, such
as deuterium, to exist.
The key parameter which allows one to calculate the effects of BBN is the baryon-photon ratio
ηb . This parameter corresponds to the temperature and density of the early Universe and allows one
to determine the conditions under which nuclear fusion occurs. From this we can derive elemental
abundances. Although ηb is important in determining elemental abundances, the precise value
makes little difference to the overall picture. Without major changes to the Big Bang theory itself,
BBN will result in mass abundances of about 75% of H, about 25% ⁴He, about 0.01% of deuterium,
trace (on the order of $10^{-10}$) amounts of lithium and beryllium, and no other heavy elements. Small
amounts of ⁷Li and ⁷Be are produced through reactions:
$$\begin{aligned}
^4\mathrm{He} + {}^3\mathrm{H} &\to {}^7\mathrm{Li} + \gamma \\
^4\mathrm{He} + {}^3\mathrm{He} &\to {}^7\mathrm{Be} + \gamma \\
^7\mathrm{Be} + e^- &\to {}^7\mathrm{Li} + \nu_e.
\end{aligned} \tag{325}$$
Heavier elements are not produced in significant amounts, since there are no stable nuclei for mass
numbers A = 5 and A = 8. The BBN is completed when all neutrons present at T = 0.1 MeV
(Xn ≈ 0.15) have been converted into deuterium (only a small fraction) and 4 He (dominates).
That the observed abundances in the Universe are generally consistent with these abundance
numbers is considered strong evidence for the Big Bang theory.
Figure 27:
Constraint on the baryon density from the BBN. Predictions are shown for the four light
elements — 4 He, deuterium (D), 3 He and lithium (Li). The boxes represent observations. There is only an
upper limit on the primordial abundance of 3 He. (Burles, Nollett & Turner 1999, astro-ph/9903300).
Recombination
When the temperature of the Universe drops to about T ≈ 1 eV, photons remain tightly coupled
to electrons via Compton scattering and electrons to protons via Coulomb scattering. Even though
this temperature is significantly below the binding energy of the hydrogen electron of $\epsilon_0 = 13.6$
eV, whenever a hydrogen atom is created, it is immediately ionized again by a high-energy photon.
This delay is caused by the high photon-baryon ratio, and is similar to the delay we have seen in
production of nuclei of light elements.
The Saha equation for the reaction which forms hydrogen atoms e− + p → H + γ is given by
$$\frac{n_e n_p}{n_H} = \frac{n_e^{(0)} n_p^{(0)}}{n_H^{(0)}}. \tag{326}$$
The equation above is simplified when we realize that the Universe is neutral in charge, which
means ne = np . We can now define a free electron fraction:
$$X_e \equiv \frac{n_e}{n_e + n_H} = \frac{n_p}{n_p + n_H}, \tag{327}$$
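and rewrite the left-hand side of the eq. (326) in terms of $X_e$:
$$\frac{n_e n_p}{n_H} = \frac{n_e n_p}{(n_e + n_H)^2} \frac{(n_e + n_H)^2}{n_H} = X_e X_p\, (n_e + n_H)\, \frac{1}{1 - X_e} = \frac{X_e^2}{1 - X_e}\, (n_e + n_H),$$
where $X_p \equiv n_p / (n_p + n_H) = X_e$ and $n_H / (n_e + n_H) = 1 - X_e$.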
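The right-hand side of the eq. (326) is obtained from the eq. (276):
$$\frac{n_e^{(0)} n_p^{(0)}}{n_H^{(0)}} = \frac{g_e \left( \frac{m_e T}{2\pi} \right)^{3/2} e^{-m_e/T}\; g_p \left( \frac{m_p T}{2\pi} \right)^{3/2} e^{-m_p/T}}{g_H \left( \frac{m_H T}{2\pi} \right)^{3/2} e^{-m_H/T}} \tag{328}$$
$$= \frac{g_e g_p}{g_H} \left( \frac{m_e T}{2\pi} \right)^{3/2} e^{-(m_e + m_p - m_H)/T} = \left( \frac{m_e T}{2\pi} \right)^{3/2} e^{-\epsilon_0/T}, \qquad m_H \approx m_p, \tag{329}$$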
where we have recognized that $\epsilon_0 = m_e + m_p - m_H$. Saha equation then reads:
$$\frac{X_e^2}{1 - X_e} = \frac{1}{n_e + n_H} \left( \frac{m_e T}{2\pi} \right)^{3/2} e^{-\epsilon_0/T}. \tag{330}$$
If we neglect a relatively small number of helium atoms, and recall that $n_e = n_p$, then the denominator in the equation above is $n_e + n_H = n_p + n_H \approx n_b$. A good approximation of the baryon
number density $n_b$ is found by combining eqs. (276) and (286):
$$n_b \equiv \eta_b n_\gamma = 5.5 \times 10^{-10} \left( \frac{\Omega_b h^2}{0.02} \right) \frac{2 T^3}{\pi^2} \approx 10^{-10}\, T^3. \tag{331}$$
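This means that when the temperature of the Universe is of the order of $\epsilon_0 = 13.6$ eV, the right-hand
side of the eq. (330) is
$$\begin{aligned}
\mathrm{RHS}(T = \epsilon_0) &= \frac{1}{10^{-10} \epsilon_0^3} \left( \frac{m_e \epsilon_0}{2\pi} \right)^{3/2} e^{-1}
= 10^{10} \left( \frac{m_e}{\epsilon_0} \right)^{3/2} \frac{1}{e\, (2\pi)^{3/2}} \\
&= 10^{10} \left( \frac{5.1 \times 10^5\ \mathrm{eV}}{13.6\ \mathrm{eV}} \right)^{3/2} 2.34 \times 10^{-2} \approx 1.7 \times 10^{15}.
\end{aligned} \tag{332}$$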
Since Xe satisfies, by definition, 0 ≤ Xe ≤ 1, the only way that the equality in eq. (330) can hold is if Xe
is very close to 1. From the definition of Xe, this means that nH ≈ 0, i.e., essentially all hydrogen is ionized.
When the temperature falls markedly below ǫ0 , a significant amount of recombination takes place.
As Xe drops, the rate of recombination also drops, so the equilibrium can no longer be maintained.
In order to track the number density of free electrons accurately, we, again, use the Boltzmann
equation for annihilation, just as we did for the neutron-nucleon ratio.
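The Saha equation (330) is a quadratic in Xe and can be solved in closed form at any temperature. The sketch below does so using the Boltzmann-approximation photon density of eq. (331); the value of ηb, the electron mass in eV, and the function names are illustrative assumptions.

```python
import math

M_E = 5.11e5      # electron mass in eV (assumed standard value)
EPS0 = 13.6       # hydrogen binding energy in eV
ETA_B = 5.5e-10   # baryon-to-photon ratio, eq. (286) with Omega_b h^2 = 0.02


def saha_rhs(T, eta_b=ETA_B):
    """Right-hand side of eq. (330) with n_b = eta_b * 2 T^3 / pi^2; T in eV."""
    n_b = eta_b * 2.0 * T ** 3 / math.pi ** 2
    return (M_E * T / (2.0 * math.pi)) ** 1.5 * math.exp(-EPS0 / T) / n_b


def saha_xe(T, eta_b=ETA_B):
    """Equilibrium free electron fraction solving X^2 / (1 - X) = S."""
    s = saha_rhs(T, eta_b)
    if s == 0.0:          # exp() underflow at very low T: fully neutral
        return 0.0
    # Positive root of X^2 + S X - S = 0, written to avoid cancellation.
    return 2.0 / (1.0 + math.sqrt(1.0 + 4.0 / s))


if __name__ == "__main__":
    for T in (13.6, 1.0, 0.4, 0.35, 0.3, 0.25):
        print(f"T = {T:5.2f} eV   X_e(Saha) = {saha_xe(T):.4f}")
```

The equilibrium solution stays at Xe ≈ 1 down to temperatures far below ε0 and then drops steeply between roughly 0.35 and 0.3 eV; as the text notes, the true Xe departs from this curve once equilibrium is lost.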
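For the reaction e⁻ + p → H + γ (1 = e, 2 = p, 3 = H, 4 = γ) the Boltzmann equation is given by:
$$\begin{aligned}
a^{-3} \frac{d}{dt}\left( n_e a^3 \right) &= n_e^{(0)} n_p^{(0)} \langle \sigma v \rangle \left[ \frac{n_H n_\gamma}{n_H^{(0)} n_\gamma^{(0)}} - \frac{n_e n_p}{n_e^{(0)} n_p^{(0)}} \right]
= \langle \sigma v \rangle \left[ \frac{n_e^{(0)} n_p^{(0)}}{n_H^{(0)}}\, n_H - n_e^2 \right] \\
a^{-3} \frac{d}{dt}\left( X_e n_b a^3 \right) &= \langle \sigma v \rangle \left[ \left( \frac{m_e T}{2\pi} \right)^{3/2} e^{-\epsilon_0/T}\, n_H - X_e^2 n_b^2 \right], \qquad X_e = \frac{n_e}{n_b}, \\
a^{-3} n_b a^3 \frac{dX_e}{dt} &= n_b \langle \sigma v \rangle \left[ (1 - X_e) \left( \frac{m_e T}{2\pi} \right)^{3/2} e^{-\epsilon_0/T} - X_e^2 n_b \right], \qquad n_H = (1 - X_e)\, n_b, \\
\Longrightarrow \quad \frac{dX_e}{dt} &= \langle \sigma v \rangle \left[ (1 - X_e) \left( \frac{m_e T}{2\pi} \right)^{3/2} e^{-\epsilon_0/T} - X_e^2 n_b \right].
\end{aligned} \tag{333}$$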
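After defining the recombination rate $\alpha^{(2)}$ and the ionization rate β:
$$\alpha^{(2)} \equiv \langle \sigma v \rangle, \qquad
\beta \equiv \langle \sigma v \rangle \left( \frac{m_e T}{2\pi} \right)^{3/2} e^{-\epsilon_0/T} = \alpha^{(2)} \left( \frac{m_e T}{2\pi} \right)^{3/2} e^{-\epsilon_0/T}, \tag{334}$$
the differential equation for $X_e$ above can be rewritten as
$$\frac{dX_e}{dt} = \left[ (1 - X_e)\, \beta - \alpha^{(2)} X_e^2 n_b \right]. \tag{335}$$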
The superscript (2) in the recombination rate α(2) denotes the n = 2 state of the electron. The
ground state (n = 1) leads to production of an ionizing photon, which immediately ionizes another
neutral atom, thus leading to zero net effect — no neutral atoms are formed this way. The only
way for the recombination to proceed is by capturing an electron in one of the excited states of
hydrogen. This rate is well-approximated by
$$\alpha^{(2)} = 9.78\, \frac{\alpha^2}{m_e^2} \left( \frac{\epsilon_0}{T} \right)^{1/2} \ln\left( \frac{\epsilon_0}{T} \right). \tag{336}$$
The Saha approximation in eq. (330) is a good approximation to the electron-baryon ratio Xe
until it falls out of equilibrium. It even correctly predicts the onset of recombination. However,
as we have seen earlier, the Saha equation is not valid when equilibrium is not preserved. The
evolution of Xe in the presence of reactions leading to the formation of neutral
atoms is accurately described only by the full Boltzmann equation given in eq. (335).
We present exact solutions and compare them to Saha equilibrium solutions as we continue our
discussion next time.
Lecture 17: Recombination and Dark Matter Production
“New ideas pass through three periods:
• It can’t be done.
• It probably can be done, but it’s not worth doing.
• I knew it was a good idea all along!”
Arthur C. Clarke
The Big Picture: Today we continue discussing the recombination epoch in the early Universe.
We also extend the Boltzmann formalism to the production of dark matter particles.
Recombination (continued)
Just as the neutron-nucleon ratio Xn is important to the abundance of light elements, the abundance of free electrons Xe is of great significance to the observational cosmology. Recombination,
which takes place around z ≈ 1000, directly leads to decoupling of photons from matter. Decoupling
means that the photons stopped scattering off electrons, which become bound into neutral atoms
during this epoch. The mean-free paths of photons become on the order of the size of the Universe, meaning that the Universe has become transparent. The resulting CMB radiation represents a
"snapshot" of the Universe at the time of the "last scatter".
Roughly speaking, decoupling occurs when the rate of Compton scattering of photons off electrons becomes smaller than the expansion rate of the Universe. The scattering rate is
ne σT = Xe nb σT ,
(337)
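where $\sigma_T = 0.665 \times 10^{-24}\ \mathrm{cm^2}$ is the Thomson cross-section, and we continue to ignore contribution
of ⁴He, by approximating $n_e + n_H \approx n_b$. The ratio of the baryon density to the critical density is
$$\begin{aligned}
\Omega_b \equiv \frac{\rho_b}{\rho_{cr0}} &= \frac{m_p n_b}{\rho_{cr0}}, \qquad \rho_{cr0} = 1.87 \times 10^{-29} h^2\ \mathrm{g\,cm^{-3}}, \qquad \Omega_b = \Omega_{b0}\, a^{-3}, \\
\Longrightarrow \quad n_b &= \frac{\rho_{cr0}}{m_p}\, \Omega_{b0}\, a^{-3} = \frac{1.87 \times 10^{-29}\ \mathrm{g\,cm^{-3}}}{1.67 \times 10^{-24}\ \mathrm{g}}\, h^2\, \Omega_{b0}\, a^{-3} \\
\Longrightarrow \quad n_b &= 1.12 \times 10^{-5}\, h^2\, \Omega_{b0}\, a^{-3}\ \mathrm{cm^{-3}},
\end{aligned} \tag{338}$$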
so that the eq. (337) then becomes
$$n_e \sigma_T = 7.448 \times 10^{-30}\ \mathrm{cm^{-1}}\; X_e\, \Omega_{b0}\, h^2\, a^{-3}. \tag{339}$$
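From eq. (73), we have
$$H_0 = \frac{h}{0.98 \times 10^{10}\ \mathrm{years}}\, \frac{1\ \mathrm{year}}{3600 \times 24 \times 365.25\ \mathrm{s}} = 0.323 \times 10^{-17}\ \mathrm{s^{-1}}\, h
\quad \Longrightarrow \quad h = 3.09 \times 10^{17}\ \mathrm{s}\; H_0, \tag{340}$$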
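so that the eq. (339) can be rewritten as
$$\begin{aligned}
n_e \sigma_T &= 7.448 \times 10^{-30}\ \mathrm{cm^{-1}}\; X_e\, \Omega_{b0}\, h\, a^{-3} \left( 3.09 \times 10^{17}\ \mathrm{s}\; H_0 \right) \\
&= 2.3 \times 10^{-12}\ \mathrm{s\,cm^{-1}}\; X_e\, \Omega_{b0}\, h\, a^{-3}\, H_0.
\end{aligned} \tag{341}$$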
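In order to get a dimensionless equation, we multiply the eq. (341) by c/H (but in the equation we
still omit c):
$$\frac{n_e \sigma_T}{H} = 2.3 \times 10^{-12}\ \mathrm{s\,cm^{-1}} \left( 3 \times 10^{10}\ \mathrm{cm\,s^{-1}} \right) X_e\, \Omega_{b0}\, h\, a^{-3}\, \frac{H_0}{H}
= 0.069\, X_e\, \Omega_{b0}\, h\, a^{-3}\, \frac{H_0}{H}. \tag{342}$$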
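During the early epochs, the Universe is either radiation- or matter-dominated, which means that
the ratio $H_0/H$ can be solved from the first Friedmann's equation [eq. (101a)]:
$$\begin{aligned}
H^2 = H_0^2\, \Omega_T &= H_0^2 \left( \Omega_{m0}\, a^{-3} + \Omega_{r0}\, a^{-4} \right) \\
\Longrightarrow \quad \frac{H}{H_0} &= \left( \Omega_{m0}\, a^{-3} + \Omega_{r0}\, a^{-4} \right)^{1/2} = \Omega_{m0}^{1/2}\, a^{-3/2} \left[ 1 + \frac{\Omega_{r0}}{\Omega_{m0}}\, a^{-1} \right]^{1/2} \\
\Longrightarrow \quad \frac{H}{H_0} &= \Omega_{m0}^{1/2}\, a^{-3/2} \left[ 1 + \frac{a_{eq}}{a} \right]^{1/2},
\end{aligned} \tag{343}$$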
where we have used the results from Appendix to Lecture 9 or eqs. (2.86)-(2.87) in the textbook:
\[ a_{eq} = \frac{\Omega_{r0}}{\Omega_{m0}} = \frac{4.14 \times 10^{-5}}{\Omega_{m0} h^2} . \tag{344} \]
Finally, we can rewrite eq. (342) in terms of z (recall a = 1/(1 + z)):
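\[
\frac{n_e \sigma_T}{H} = 0.069\, X_e \Omega_{b0} h a^{-3}\, \Omega_{m0}^{-1/2} a^{3/2} \left[ 1 + \frac{a_{eq}}{a} \right]^{-1/2}
\]
\[
= 0.069\, X_e \Omega_{b0} h (1+z)^{3/2}\, \Omega_{m0}^{-1/2} \left[ 1 + (1+z) \frac{4.14 \times 10^{-5}}{\Omega_{m0} h^2} \right]^{-1/2}
\]
\[
= 113\, X_e \left( \frac{\Omega_{b0} h^2}{0.02} \right) \left( \frac{0.15}{\Omega_{m0} h^2} \right)^{1/2} \left( \frac{1+z}{1000} \right)^{3/2}
\left[ 1 + \frac{1+z}{3600} \frac{0.15}{\Omega_{m0} h^2} \right]^{-1/2} , \tag{345}
\]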
where the constants have been normalized to the best-fit values obtained from observations. When
the free electron fraction Xe drops below ≈ 10−2 , photons decouple from matter. This happens
before the recombination is over, i.e., before the electron fraction Xe levels off below 10−3 .
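To get a feel for the numbers in eq. (345), the last line can be evaluated directly. The following is a minimal sketch; the function name and the default parameter values (taken from the normalization of eq. (345)) are illustrative choices, and the values of X_e in the loop are simply trial inputs rather than the output of a recombination calculation.

```python
def scattering_to_expansion_ratio(z, x_e, omega_b_h2=0.02, omega_m_h2=0.15):
    """Return n_e sigma_T / H using the last line of eq. (345)."""
    one_plus_z = 1.0 + z
    bracket = 1.0 + (one_plus_z / 3600.0) * (0.15 / omega_m_h2)
    return (113.0 * x_e
            * (omega_b_h2 / 0.02)
            * (0.15 / omega_m_h2) ** 0.5
            * (one_plus_z / 1000.0) ** 1.5
            * bracket ** -0.5)


# Trial ionization fractions at 1 + z = 1000
for x_e in (1.0, 1e-1, 1e-2, 1e-3):
    ratio = scattering_to_expansion_ratio(z=999.0, x_e=x_e)
    print(f"X_e = {x_e:7.0e}  ->  n_e sigma_T / H = {ratio:9.4f}")
```

At 1 + z = 1000 the ratio crosses unity when X_e is close to 10^-2, which is the statement made in the text.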
Even if the Universe remained ionized throughout its history, at some point photons would
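decouple from baryons. This can easily be seen from eq. (345) if we set $X_e = 1$ (i.e., all
electrons are free). Then, setting $n_e \sigma_T / H = 1$ and neglecting the last bracket (which is close to unity for $1 + z \ll 3600$), we arrive at
\[ 1 + z_{decouple} = 43 \left( \frac{0.02}{\Omega_{b0} h^2} \right)^{2/3} \left( \frac{\Omega_{m0} h^2}{0.15} \right)^{1/3} , \tag{346} \]
which, if the terms in parentheses are taken to be equal to one, corresponds to $z_{decouple} = 42$, i.e.,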
t ≈ 60 million years.
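The approximation behind eq. (346) can be checked by solving $n_e \sigma_T / H = 1$ numerically with the full bracket of eq. (345) retained. The sketch below uses simple bisection; the function names and the search interval are illustrative assumptions.

```python
def ratio_fully_ionized(one_plus_z, omega_b_h2=0.02, omega_m_h2=0.15):
    """n_e sigma_T / H from eq. (345) with X_e = 1."""
    bracket = 1.0 + (one_plus_z / 3600.0) * (0.15 / omega_m_h2)
    return (113.0
            * (omega_b_h2 / 0.02)
            * (0.15 / omega_m_h2) ** 0.5
            * (one_plus_z / 1000.0) ** 1.5
            * bracket ** -0.5)


def decoupling_numeric(omega_b_h2=0.02, omega_m_h2=0.15):
    """Solve ratio = 1 for 1 + z by bisection (the ratio grows monotonically with 1 + z)."""
    lo, hi = 1.0, 1000.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if ratio_fully_ionized(mid, omega_b_h2, omega_m_h2) > 1.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def decoupling_approx(omega_b_h2=0.02, omega_m_h2=0.15):
    """Closed form of eq. (346)."""
    return 43.0 * (0.02 / omega_b_h2) ** (2.0 / 3.0) * (omega_m_h2 / 0.15) ** (1.0 / 3.0)


print(f"1 + z_decouple (bisection): {decoupling_numeric():.2f}")
print(f"1 + z_decouple (eq. 346)  : {decoupling_approx():.2f}")
```

For the fiducial parameters the two values agree to well within one unit in 1 + z, which confirms that dropping the bracket is harmless at such low redshift.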
Recombination timeframe. We can compute when the recombination took place, by computing
how old the Universe was at z ≈ 1000 (see Table 5):
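\[
t(z) = \frac{1}{H_0} \int_z^\infty \frac{d\tilde z}{\sqrt{\Omega_{m0} (1+\tilde z)^5 + \Omega_{r0} (1+\tilde z)^6 + \Omega_{de0} (1+\tilde z)^2}} , \tag{347}
\]
which gives $t(1000) \approx 440{,}000$ years (for $h = 0.72$, $\Omega_{m0} = 0.28$, $\Omega_{r0} = 4.15 \times 10^{-5}\, h^{-2}$, $\Omega_{de0} = 0.72$).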
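The integral in eq. (347) is straightforward to evaluate numerically. The following is a minimal sketch using Simpson's rule after the substitution $u = \ln(1+\tilde z)$, which turns the integrand into $1/\sqrt{\Omega_{m0} e^{3u} + \Omega_{r0} e^{4u} + \Omega_{de0}}$. The function name, the number of steps and the finite upper cutoff are assumptions made for illustration; the Hubble time $1/H_0 = 0.98 \times 10^{10}\, h^{-1}$ years is taken from eq. (340).

```python
import math


def age_at_redshift(z, h=0.72, omega_m0=0.28, omega_de0=0.72,
                    n_steps=4000, u_span=40.0):
    """Age of the Universe (in years) at redshift z from eq. (347)."""
    omega_r0 = 4.15e-5 / h**2
    hubble_time_years = 0.98e10 / h            # 1 / H0, from eq. (340)

    def integrand(u):
        # u = ln(1 + z): dz = (1 + z) du removes one power of (1 + z)
        return 1.0 / math.sqrt(omega_m0 * math.exp(3.0 * u)
                               + omega_r0 * math.exp(4.0 * u)
                               + omega_de0)

    u_min = math.log(1.0 + z)
    u_max = u_min + u_span                     # integrand ~ e^{-2u}: the tail is negligible
    step = (u_max - u_min) / n_steps           # n_steps must be even for Simpson's rule
    total = integrand(u_min) + integrand(u_max)
    for i in range(1, n_steps):
        total += (4 if i % 2 else 2) * integrand(u_min + i * step)
    return hubble_time_years * total * step / 3.0


print(f"t(z = 1000) = {age_at_redshift(1000.0):,.0f} years")
```

With the rounded inputs listed above the result should land within a few percent of the quoted value; the dark energy term is entirely negligible at this redshift, while the radiation term is not.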
Figure 28: Free electron fraction Xe as a function of redshift. Recombination takes place abruptly at about
z ≈ 1000, which corresponds to T ≈ 0.25eV. The Saha approximation in eq. (330) is a correct description
during equilibrium and accurately identifies the onset of recombination, but not the long-term behavior, for
which the full Boltzmann equation is necessary. (Here Ωb0 = 0.06, Ωm0 = 1, h = 0.5.)
Earlier (Appendix to Lecture 9 or eq. (2.87) in the textbook), we have derived that the Universe
made a transition from radiation- to matter-dominated at about $z_{eq} = 2.43 \times 10^4\, \Omega_{m0} h^2 \approx 3500$,
which corresponds to when the Universe was about 50,000 years old. This means that recombination happened during the matter-dominated epoch.
Structure formation. Recombination was followed by the dark ages during which the baryonic
matter was neutral. It is during this time that the first structures in the Universe started to form.
Structure formation in the Big Bang model proceeds hierarchically, with smaller structures forming
before larger ones. The first structures to form are quasars, which are thought to be bright, early
active galaxies, and population III stars. Before this epoch, the evolution of the Universe could be
understood through linear cosmological perturbation theory — all structures could be understood
as small deviations from a perfect homogeneous Universe. This is computationally relatively easy
to study. At this point nonlinear structures begin to form, and the computational problem becomes
much more difficult, involving, for example, N-body simulations with billions of particles.
Reionization. Reionization took place when the first objects started to form in the early Universe
energetic enough to ionize neutral hydrogen. As these first objects formed and radiated energy,
the Universe went from being neutral back to being an ionized plasma, between 150 million and
one billion years after the Big Bang (at a redshift 6 < z < 20). When protons and electrons
are separate, they cannot capture energy in the form of photons. Photons may be scattered, but
scattering interactions are infrequent if the density of the plasma is low. Thus, a Universe full of
low density ionized hydrogen will be relatively translucent, as is the case today.
Dark Matter
Earlier, in Lectures 10 and 11, we discussed the evidence for nonbaryonic matter in the Universe,
and came to the general conclusion that the total contribution of such matter to the energy
density is Ωdm ≈ 0.3. We also established WIMPs as the leading candidates for the nonbaryonic
dark matter. Even though we do not know yet what these particles are, we do know that if
such particles exist, they were at some point in equilibrium with the rest of the cosmic plasma
at high temperatures of the early Universe. At some point, they experienced “freeze-out” as the
temperature of the Universe dropped below the WIMP’s mass. Had it not been for falling out of
the equilibrium (“freeze-out”), the abundance of the dark matter particles would decay as e−m/T ,
which would lead to their extinction. However, they do freeze out at some point, which is why we
use the Boltzmann equation (instead of its equilibrium version, the Saha equation) to determine
when they froze out and to quantify their relic abundance. The idea is to use the conclusions from
observations and the earlier epochs of the Big Bang (the BBN), such as Ωdm ≈ 0.3, to constrain
the properties of the unknown WIMPs: their mass and cross-section. Putting such constraints on
the WIMPs would be useful in the experimental attempts at their direct detection.
We now consider a generic scenario, in which two heavy WIMPs (denoted as X) annihilate
and produce two light (essentially massless) particles (l). The light particles are assumed to be
in complete equilibrium with the cosmic plasma, which means $n_l = n_l^{(0)}$. This means that in the
reaction X + X → l + l (1=X, 2=X, 3=l, 4=l), there is only one unknown, $n_X$, the abundance of
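the WIMPs. Again, we use the Boltzmann equation [eq. (280)]:
\[
a^{-3} \frac{d \left( n_X a^3 \right)}{dt}
= n_X^{(0)} n_X^{(0)} \langle \sigma v \rangle \left[ \frac{n_l n_l}{n_l^{(0)} n_l^{(0)}} - \frac{n_X n_X}{n_X^{(0)} n_X^{(0)}} \right]
\quad \Longrightarrow \quad
a^{-3} \frac{d \left( n_X a^3 \right)}{dt} = \langle \sigma v \rangle \left[ \left( n_X^{(0)} \right)^2 - n_X^2 \right] . \tag{348}
\]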
As we did before, we continue to “massage” the Boltzmann equation into something mathematically
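more elucidating. After recalling that the temperature scales as $T \sim a^{-1}$ (so that $aT$ is constant), we can rewrite the LHS
of eq. (348) above as:
\[
a^{-3} \frac{d \left( n_X a^3 \right)}{dt}
= a^{-3} \frac{d}{dt} \left( \frac{n_X}{T^3}\, a^3 T^3 \right)
= a^{-3} a^3 T^3 \frac{d}{dt} \left( \frac{n_X}{T^3} \right)
= T^3 \frac{d}{dt} \left( \frac{n_X}{T^3} \right) . \tag{349}
\]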
After defining the quantity Y as
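\[ Y \equiv \frac{n_X}{T^3} , \tag{350} \]
we can rewrite eq. (348) above as
\[
T^3 \frac{dY}{dt} = \langle \sigma v \rangle T^6 \left[ \left( \frac{n_X^{(0)}}{T^3} \right)^2 - \left( \frac{n_X}{T^3} \right)^2 \right]
\quad \Longrightarrow \quad
\frac{dY}{dt} = \langle \sigma v \rangle T^3 \left[ Y_{EQ}^2 - Y^2 \right] , \tag{351}
\]
where $Y_{EQ} \equiv n_X^{(0)} / T^3$. It is, again, beneficial to introduce a new time variable:
\[ x \equiv \frac{m_X}{T} , \tag{352} \]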
where $m_X$ is the mass of the WIMP. Again, very high temperatures correspond to $x \ll 1$, which
is when the reactions proceed rapidly enough to maintain equilibrium, $Y \approx Y_{EQ}$. Since the WIMPs
are relativistic at that time, their equilibrium abundance is given by the $m \ll T$ portion of
eq. (276), so
\[
Y \approx Y_{EQ} = \frac{n_X^{(0)}}{T^3} = \frac{g_X T^3 / \pi^2}{T^3} = \frac{g_X}{\pi^2} \sim 1 . \tag{353}
\]
For $x \gg 1$, the exponential factor $e^{-x}$ dominates and suppresses the equilibrium abundance $Y_{EQ}$. Eventually,
the WIMPs become so rare due to this suppression that they no longer can find each other fast
enough to maintain the equilibrium abundance. This is when the freeze-out begins.
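We rewrite the Boltzmann equation in terms of the new integration variable x, using $T = k a^{-1}$ (with k a constant) and the fact that during radiation domination $H \propto T^2 \propto x^{-2}$, so that $H = H(x=1)/x^2$:
\[
\frac{dY}{dt} = \frac{dY}{dx} \frac{dx}{dt}
= \frac{dY}{dx} \left( - x \frac{\dot T}{T} \right)
= - \frac{dY}{dx}\, x \left[ \frac{- k a^{-2} \dot a}{k a^{-1}} \right]
= \frac{dY}{dx}\, x\, \frac{\dot a}{a}
= \frac{dY}{dx}\, x H
\]
\[
\Longrightarrow \quad \frac{dY}{dx}\, x\, \frac{H(x=1)}{x^2} = \frac{dY}{dx} \frac{H(x=1)}{x} = \langle \sigma v \rangle T^3 \left[ Y_{EQ}^2 - Y^2 \right]
\]
\[
\Longrightarrow \quad \frac{dY}{dx} = \frac{x}{H(x=1)} \langle \sigma v \rangle T^3 \left[ Y_{EQ}^2 - Y^2 \right]
= \frac{m_X^3 \langle \sigma v \rangle}{H(x=1)} \frac{T^3}{m_X^3}\, x \left[ Y_{EQ}^2 - Y^2 \right]
\]
\[
\Longrightarrow \quad \frac{dY}{dx} = - \frac{\lambda}{x^2} \left[ Y^2 - Y_{EQ}^2 \right] , \tag{354}
\]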
where the ratio of annihilation rate to the expansion rate is given by
\[ \lambda \equiv \frac{m_X^3 \langle \sigma v \rangle}{H(x=1)} . \tag{355} \]
In most theories, λ is a constant. Some theories, however, have a temperature-dependent thermally averaged cross-section, which leads to a variable λ. This changes the quantitative results slightly,
while the qualitative solutions remain the same.
Lecture 18: Dark Matter Particle Production
“The simple is the seal of the true.”
Subrahmanyan Chandrasekhar (on GR)
The Big Picture: Today we finish the discussion of dark matter particle production. Even
though we do not know the mass of the particles, the Boltzmann equation can be used to derive
the relationship between the mass of the particles and their present-day abundance.
Dark Matter Particle Production (continued)
Eq. (354) is not analytically tractable, so its solution requires numerical evaluation. However, we can, again, get a good qualitative feel for its behavior through a simple analysis of the
orders of magnitude of the terms, as we have done earlier. When $x \sim 1$, the left-hand side of
eq. (354) is on the order of $Y$, while the right-hand side is on the order of $\lambda \left( Y^2 - Y_{EQ}^2 \right)$. Since
λ is quite large, the equality is maintained only with $Y \approx Y_{EQ}$. Later, as the temperature T drops, x
increases, and the equilibrium $Y_{EQ}$ is no longer a good approximation to Y. After the freeze-out, Y
is much larger than $Y_{EQ}$, as particles are not able to annihilate fast enough to maintain equilibrium.
Therefore, at later times
\[ \frac{dY}{dx} \simeq - \frac{\lambda Y^2}{x^2} , \qquad \text{for } x \gg 1 . \tag{356} \]
This equation can be integrated analytically from the epoch of freeze-out x = xf , Y = Yf until
very late times x = ∞, Y = Y∞ to obtain
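\[
\int_{Y_f}^{Y_\infty} \frac{dY}{Y^2} \simeq - \int_{x_f}^{\infty} \frac{\lambda\, dx}{x^2}
\quad \Longrightarrow \quad
\left. - \frac{1}{Y} \right|_{Y_f}^{Y_\infty} = \left. \frac{\lambda}{x} \right|_{x_f}^{\infty}
\quad \Longrightarrow \quad
- \frac{1}{Y_\infty} + \frac{1}{Y_f} \simeq - \frac{\lambda}{x_f}
\quad \Longrightarrow \quad
\frac{1}{Y_\infty} - \frac{1}{Y_f} = \frac{\lambda}{x_f} . \tag{357}
\]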
Generally, Y at freeze-out Yf is much larger than Y∞ , so 1/Y∞ ≫ 1/Yf , and the term 1/Yf can be
neglected. Then a simple analytic approximation for Y∞ is
\[ Y_\infty \simeq \frac{x_f}{\lambda} . \tag{358} \]
This approximation still depends on the freeze-out temperature xf which is yet to be determined.
Typically, xf ∼ 10.
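Since eq. (354) has to be solved numerically, it may help to see how little machinery is needed. The following is a minimal sketch, not the method used to produce the figures in these notes. It assumes the non-relativistic Maxwell-Boltzmann form $Y_{EQ} = g\,(x/2\pi)^{3/2} e^{-x}$ with $g = 2$ for the equilibrium abundance, starts from $Y = Y_{EQ}$ at $x = 1$, and uses a backward-Euler step on a logarithmic grid in x because the equation is stiff for large λ. The two values of λ are illustrative.

```python
import math


def y_equilibrium(x, g=2.0):
    """Assumed equilibrium abundance: Y_EQ = g (x / 2 pi)^{3/2} e^{-x}."""
    return g * (x / (2.0 * math.pi)) ** 1.5 * math.exp(-x)


def relic_abundance(lam, x_start=1.0, x_end=1000.0, n_steps=20000):
    """Integrate dY/dx = -(lam / x^2) (Y^2 - Y_EQ^2) with backward Euler."""
    log_step = math.log(x_end / x_start) / n_steps
    x = x_start
    y = y_equilibrium(x)
    for _ in range(n_steps):
        x_next = x * math.exp(log_step)
        a = (x_next - x) * lam / x_next**2
        c = y + a * y_equilibrium(x_next) ** 2
        # Implicit step: positive root of a*Y^2 + Y - c = 0,
        # written in a form that avoids cancellation when a*c is large.
        y = 2.0 * c / (1.0 + math.sqrt(1.0 + 4.0 * a * c))
        x = x_next
    return y


for lam in (1e5, 1e10):
    y_inf = relic_abundance(lam)
    print(f"lambda = {lam:.0e}:  Y_inf = {y_inf:.3e},  "
          f"implied x_f = lambda * Y_inf = {lam * y_inf:.1f}")
```

The product $\lambda Y_\infty$ printed in the last column can be read as an effective $x_f$ in the sense of eq. (358), and a larger λ yields a smaller relic abundance.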
Figure 29 shows the numerical solution to eq. (354) for two different values of λ. The
equilibrium approximation is valid to about m/T ≈ 10, after which the Boltzmann non-equilibrium
solution levels off. The rough approximation of Y∞ ≈ 10/λ is a decent approximation for the
relic abundance. The particles with the larger cross-sections (and consequently, by definition,
larger λ) freeze-out later, because the bigger the cross-section, the longer they will continue to
interact. Furthermore, this prolonged annihilation results in a lower relic abundance. The inset in
Fig. (29) shows that the distinction between BE, FD and Boltzmann statistics is only important for
temperatures above the particle’s mass. Since the freeze-out happens at temperatures significantly
below the particle’s mass (recall the delay in freezing-out), the use of Boltzmann statistics is
justified. After the freeze-out, the dark matter particle density scales as $\rho_X \propto a^{-3}$. This means that
Figure 29: Abundance of heavy stable particle as the temperature drops beneath its mass. Dashed line is
equilibrium abundance. Two different solid curves show heavy particle abundance for two different values
of λ, the ratio of the annihilation rate to the Hubble rate. Inset shows that the difference between quantum
statistics and Boltzmann statistics is important only at temperatures larger than the mass.
its energy density today is equal to
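\[
\rho_X(a_0)\, a_0^3 = \rho_X(a_1)\, a_1^3
\quad \Longrightarrow \quad
\rho_X(a_0) = \rho_X(a_1) \left( \frac{a_1}{a_0} \right)^3 = m_X n_X(a_1) \left( \frac{a_1}{a_0} \right)^3 , \tag{359}
\]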
where a1 corresponds to the time when Y has reached its asymptotic value of Y∞ . The number
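density at that time is [from the definition $Y \equiv n_X / T^3$ in eq. (350)] $n_X = Y_\infty T_1^3$, so
\[
\rho_X(a_0) \equiv \rho_{X0} = m_X Y_\infty T_1^3 \left( \frac{a_1}{a_0} \right)^3
= m_X Y_\infty T_0^3 \left( \frac{a_1 T_1}{a_0 T_0} \right)^3 . \tag{360}
\]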
At first glance, we may expect that the ratio in the parentheses is unity because we have used
$T \propto a^{-1}$. However, this is only true after the annihilations of many particles in the primordial soup
have been completed, since such annihilations raise the temperature of the Universe. (We have already
talked about an example of this: annihilation of electrons and positrons heats up photons, while
neutrinos, which have decoupled shortly before that, remain unaffected.) This means that the ratio
$(a_1 T_1)/(a_0 T_0)$ has to be computed from the entropy density argument, and the fact that it scales
as $a^{-3}$, as we have computed earlier (Lecture 9):
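\[
s \equiv \frac{\rho + P}{T}, \qquad \text{radiation-dominated: } P = \tfrac{1}{3} \rho
\]
\[
s = \frac{4 \rho}{3 T} = \frac{4}{3} \frac{g_\star \frac{\pi^2}{30} T^4}{T} = \frac{4 \pi^2}{90}\, g_\star T^3 \qquad \text{[eq. (306)]}
\]
\[
s(a_1)\, a_1^3 = s(a_0)\, a_0^3
\quad \Longrightarrow \quad
\frac{4 \pi^2}{90}\, g_\star(a_1)\, T_1^3 a_1^3 = \frac{4 \pi^2}{90}\, g_\star(a_0)\, T_0^3 a_0^3
\quad \Longrightarrow \quad
\left( \frac{a_1 T_1}{a_0 T_0} \right)^3 = \frac{g_\star(a_0)}{g_\star(a_1)} , \tag{361}
\]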
where g⋆ (a0 ) was computed earlier (at T ≈ 0.1 MeV, after the annihilation of electrons and
positrons) to be g⋆ (a0 ) = 3.36 [eq. (315)]. The effective number of relativistic particles at high
temperatures when Y → Y∞ then becomes [eq. (306)]
\[ g_\star(a_1) = \sum_{i = \mathrm{bosons}} g_i + \frac{7}{8} \sum_{i = \mathrm{fermions}} g_i , \tag{362} \]
where the constituent particles are: quarks (g = 5 × 3 × 2 = 30 for 5 least massive types — up,
down, strange, charmed and bottom; top quark is too heavy to be around at these temperatures
since mtop ≈ 176 GeV) — with 3 colors and 2 spin states; anti-quarks (also g = 30); leptons
(g = 6 × 2 = 12 for 6 types, namely e, νe, µ, νµ, τ, ντ, and 2 spin states); anti-leptons (also g = 12);
photons g = 2; and gluons g = 8 × 2 = 16 for 8 possible colors and 2 spin states. The grand total for the
effective number of relativistic particles is then
\[ g_\star(a_1) = 2 + 16 + \frac{7}{8} (30 + 30 + 12 + 12) = 91.5 . \tag{363} \]
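The bookkeeping behind eqs. (362)-(363) can be written out as a short tally. The following sketch uses only the degeneracy factors listed above; the dictionary layout and names are illustrative, and the value $g_\star(a_0) = 3.36$ is the one quoted from eq. (315).

```python
# Degrees of freedom of relativistic species at high temperature, eq. (362)
bosons = {
    "photons": 2,
    "gluons": 8 * 2,            # 8 colors x 2 spin states
}
fermions = {
    "quarks": 5 * 3 * 2,        # 5 flavors x 3 colors x 2 spin states
    "anti-quarks": 5 * 3 * 2,
    "leptons": 6 * 2,           # 6 types x 2 spin states
    "anti-leptons": 6 * 2,
}

g_star_a1 = sum(bosons.values()) + 7.0 / 8.0 * sum(fermions.values())
g_star_a0 = 3.36                # value after e+ e- annihilation, eq. (315)

ratio = g_star_a0 / g_star_a1   # (a1 T1 / a0 T0)^3 from eq. (361)

print(f"g_star(a1)          = {g_star_a1}")
print(f"(a1 T1 / a0 T0)^3   = {ratio:.4f}  (about 1/{1.0 / ratio:.1f})")
```

The ratio comes out near 1/27, which the notes round to 1/30 for consistency with the textbook.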
Finally, the ratio [(a1 T1 )/(a0 T0 )]3 is
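\[ \left( \frac{a_1 T_1}{a_0 T_0} \right)^3 = \frac{3.36}{91.5} \approx \frac{1}{27} \approx \frac{1}{30} \tag{364} \]
to be consistent with the textbook. The energy density of the dark matter particles today is then
\[ \rho_{X0} \approx m_X Y_\infty \frac{T_0^3}{30} . \tag{365} \]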
The fraction of critical density today due to the dark matter particles X is
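\[
\Omega_{X0} \equiv \frac{\rho_{X0}}{\rho_{cr}}
\approx m_X Y_\infty \frac{T_0^3}{30\, \rho_{cr}}
\overset{\text{eq. (358)}}{\approx} m_X \frac{x_f}{\lambda} \frac{T_0^3}{30\, \rho_{cr}}
\overset{\text{eq. (355)}}{\approx} m_X \frac{H(x=1)\, x_f}{m_X^3 \langle \sigma v \rangle} \frac{T_0^3}{30\, \rho_{cr}}
= \frac{H(x=1)\, x_f\, T_0^3}{m_X^2 \langle \sigma v \rangle\, 30\, \rho_{cr}} . \tag{366}
\]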
But, from eqs. (306) and (308), we have
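\[
H(x) = \sqrt{\frac{8 \pi G \rho}{3}}
= \sqrt{\frac{8 \pi G\, g_\star(x) \frac{\pi^2}{30} T^4}{3}}
= \sqrt{\frac{4 \pi^3 G\, g_\star(x)}{45}}\; T^2
= \sqrt{\frac{4 \pi^3 G\, g_\star(x)}{45}}\; m_X^2 x^{-2}
\]
\[
\Longrightarrow \quad H(x=1) = \sqrt{\frac{4 \pi^3 G\, g_\star(x=1)}{45}}\; m_X^2 , \tag{367}
\]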
so that the eq. (366) now reads
\[
\Omega_{X0} = \sqrt{\frac{4 \pi^3 G\, g_\star(x=1)}{45}}\; \frac{x_f\, T_0^3}{30\, \langle \sigma v \rangle\, \rho_{cr}} . \tag{368}
\]
The fraction of critical density due to dark matter today, $\Omega_{X0}$, depends implicitly on the mass of
the X particle through the freeze-out time $x_f$ and the effective number of relativistic particles at
x = 1, $g_\star(x=1)$. The explicit dependence is only on the cross-section.
Now we use the result obtained from the observations and the predictions of the BBN, that
ΩX0 = Ωdm0 ≈ 0.3. We normalize the eq. (368) to the most likely values of quantities included
(observations and predictions):
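\[
\Omega_{X0} = 0.3\, h^{-2} \left( \frac{x_f}{10} \right) \left( \frac{g_\star(x=1)}{100} \right)^{1/2} \frac{10^{-39}\ \mathrm{cm^2}}{\langle \sigma v \rangle} . \tag{369}
\]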
It is a good sign that the “best-fit” cross-section is on the order of 10−39 cm2 , because there are
several theories which predict particles with cross-section that small.
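The scaling relation in eq. (369) can be turned around to ask which cross-section reproduces a given relic density. The following is a minimal sketch; the function names are illustrative, and the default h = 0.72 is an assumption borrowed from the parameter set used earlier for t(1000).

```python
def omega_x0(sigma_v_cm2, h=0.72, x_f=10.0, g_star=100.0):
    """Relic density parameter from the scaling relation of eq. (369)."""
    return (0.3 / h**2
            * (x_f / 10.0)
            * (g_star / 100.0) ** 0.5
            * (1e-39 / sigma_v_cm2))


def cross_section_for(omega_target=0.3, h=0.72, x_f=10.0, g_star=100.0):
    """Invert eq. (369): cross-section (cm^2) that yields omega_target."""
    return (1e-39 / h**2
            * (x_f / 10.0)
            * (g_star / 100.0) ** 0.5
            * (0.3 / omega_target))


sigma = cross_section_for(omega_target=0.3)
print(f"cross-section for Omega_X0 = 0.3 with h = 0.72: {sigma:.2e} cm^2")
print(f"check: Omega_X0 = {omega_x0(sigma):.3f}")
```

Because $\Omega_{X0}$ is inversely proportional to the cross-section, halving the cross-section doubles the predicted relic density.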
The theory which, at least at present, appears most likely to feature a WIMP dark matter
particle is supersymmetry. Supersymmetry claims that all the particles in the standard model have
their “superpartners”, which are too massive to have yet been observed. Of those, only the neutral
and stable particles are viable candidates for dark matter constituents, because the dark matter
is not affected by electromagnetic interactions and it has been around since the early times of the Universe
(if it were not stable, it would have decayed away by now). The first of these criteria restricts
the dark matter particle to be the partner of one of the neutral particles, such as Higgs or the
photon. The second restriction requires the dark matter particle to be the lightest supersymmetric
particle of these, because heavier particles decay into lighter ones over time (and hence would not
be stable).
A great deal of effort has been expended in search of the dark matter particles. Even though
the numerous ongoing experiments have not yet directly detected the dark matter particles, they
are successfully restricting the properties of such a particle. They restrict regions in the scattering
cross-section versus mass graph where dark matter particles may exist (Fig. 30).
Figure 30: Constraints on supersymmetric dark matter particle. Regions above the solid curves are
excluded, while filled region is reported detection by DAMA. Note the limits on the cross-section are in units
of picobarns (1 picobarn = 10−36 cm2 ).
Lecture 19: Cosmic Microwave Background Radiation
“Observe the void — its emptiness emits a pure light.”
Chuang-tzu
The Big Picture: Today we are discussing the cosmic microwave background (CMB) radiation,
the “snapshot” of the Universe at its infancy — when it was only about a few hundred thousand
years old. We present the spectrum of the radiation and analyze its main features.
Importance of the CMB Radiation
The CMB radiation is a prediction of Big Bang theory. According to the Big Bang theory, the
early Universe was made up of a hot plasma of photons, electrons and baryons. The photons were
constantly interacting with the plasma through Thomson scattering. As the Universe expanded,
adiabatic cooling caused the plasma to cool until it became favorable for electrons to combine
with protons and form hydrogen atoms. This happened at around 3,000 K or when the Universe
was approximately 380,000 years old (z ≈ 1100). At this point, the photons stopped scattering off the
now neutral atoms and began to travel freely through space. This process is called recombination
or decoupling (referring to electrons combining with nuclei and to the decoupling of matter and
radiation respectively).
The photons have continued cooling ever since; they have now reached 2.725 K and their temperature will continue to drop as long as the Universe continues expanding (Tγ ∝ a−1 ). Accordingly,
the radiation from the sky that we measure today comes from a spherical surface, called the surface of last scattering. This represents the collection of points in space (currently around 46 billion
light years from the Earth) at which the decoupling event happened long enough ago (less than
400,000 years after the Big Bang, 13.7 billion years ago) that the light from that part of space is
just reaching observers.
The Big Bang theory suggests that the CMB radiation fills all of observable space, and that
most of the radiation energy in the Universe is in the cosmic microwave background, which makes
up a fraction of roughly 5 × 10−5 of the total density of the Universe.
Two of the greatest successes of the Big Bang theory are its prediction of the almost perfect
black-body spectrum of the CMB and its detailed prediction of the anisotropies in the CMB radiation. The
recent Wilkinson Microwave Anisotropy Probe (WMAP) has precisely measured these anisotropies
over the whole sky down to angular scales of 0.2 degrees. These can be used to estimate the
parameters of the standard ΛCDM model of the Big Bang (recall Article 3). Some information,
such as the shape of the Universe, can be obtained directly from the CMB radiation, while others,
such as the Hubble constant, are not constrained and must be inferred from other measurements.
Black-body spectrum. The function describing the distribution of photons radiated by a black body is simply given by the equilibrium BE statistics, after taking E = p = ħν = ν:
\[ f(\nu) = \frac{1}{e^{\nu/T} - 1} , \tag{370} \]
and the corresponding intensity of the black-body spectrum is given by the Planck distribution
\[ I(\nu) = \frac{4 \pi \nu^3}{e^{\nu/T} - 1} . \tag{371} \]
The excellent agreement between the theoretical spectrum in eq. (371) and the measurements is shown in Fig. 31.
Figure 31: Intensity of CMB radiation as a function of a wavenumber from FIRAS instrument on COBE
satellite. The differences between the theoretical prediction and the measured values are all smaller than
the thickness of the line.
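A quick way to explore the shape of the spectrum in eq. (371) is to locate its maximum. Setting $dI/d\nu = 0$ with $x = \nu/T$ gives the condition $x = 3\,(1 - e^{-x})$, which can be solved by fixed-point iteration. The following is a minimal sketch in the same natural units as the text; the function names are illustrative.

```python
import math


def intensity(nu, temperature):
    """Black-body intensity of eq. (371), with nu and T in the same (natural) units."""
    return 4.0 * math.pi * nu**3 / math.expm1(nu / temperature)


def peak_position(tol=1e-12, max_iter=200):
    """Solve x = 3 (1 - exp(-x)) for x = nu_peak / T by fixed-point iteration."""
    x = 3.0
    for _ in range(max_iter):
        x_new = 3.0 * (1.0 - math.exp(-x))
        if abs(x_new - x) < tol:
            break
        x = x_new
    return x_new


x_peak = peak_position()
print(f"nu_peak / T = {x_peak:.4f}")

# Sanity check: the intensity at the peak exceeds its neighbors
T = 1.0
print(intensity(x_peak * T, T) > intensity(0.9 * x_peak * T, T))
print(intensity(x_peak * T, T) > intensity(1.1 * x_peak * T, T))
```

The peak frequency scales linearly with temperature, so as the photons cool with $T_\gamma \propto a^{-1}$ the whole spectrum shifts to lower frequencies while keeping its black-body shape.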
Systematic Bias: The Dipole Anisotropy
If CMB radiation looks like a perfect black-body radiation to one observer, it should not look like
a perfect black-body to other observers who are moving relative to the first observer. The radiation
should be Doppler shifted because of the observer’s motion. The observed radiation should appear
somewhat bluer (hotter) in the direction in which the observer is moving, and somewhat redder
(cooler) in the opposite direction. The relativistic Doppler effects due to the motion of our frame of
reference in relation to the frame of reference in which the CMB radiation is a perfect black-body
need to be accounted for before one can successfully analyze the CMB spectrum.
Relativistic Doppler shift. Assume the source and the observer are moving away from each other with a relative
velocity v. Let us derive the SR relation connecting the frequencies of light emitted in one (denoted
with subscript 1) and received in another reference system (subscript 2), moving away at speed v.
Suppose one wavefront arrives at the observer. The next wavefront is then a distance λ = c/ν1
away from him/her (where λ is the wavelength, ν1 the frequency of the wave emitted, and c is the
speed of light). Since the wavefront moves with velocity c and the observer escapes with velocity
v, the time observed between crests is
\[
t = \frac{\lambda}{c - v} = \frac{\lambda / c}{1 - \frac{v}{c}} = \frac{1}{\left( 1 - \frac{v}{c} \right) \nu_1} , \qquad \nu_1 = \frac{c}{\lambda} . \tag{372}
\]
However, due to the relativistic time dilation, the observer will measure this time to be
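\[ t_2 = \frac{t}{\gamma} = \frac{1}{\gamma \left( 1 - \frac{v}{c} \right) \nu_1} , \tag{373} \]
where $\gamma = 1 / \sqrt{1 - v^2/c^2}$, so the observed frequency is
\[ \nu_2 = \frac{1}{t_2} = \gamma \left( 1 - \frac{v}{c} \right) \nu_1 , \tag{374} \]
and the corresponding relativistic Doppler shift
\[ \frac{\nu_2}{\nu_1} = \gamma \left( 1 - \frac{v}{c} \right) = \frac{1 - \frac{v}{c}}{\sqrt{1 - \frac{v^2}{c^2}}} . \tag{375} \]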
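In a more general case, when the motion of the two reference frames is given by a vector n̂, such
that $\vec v \cdot \hat n = v \cos\theta$, the equation for the relativistic Doppler shift becomes
\[
\frac{\nu_2}{\nu_1} = \frac{1 - \frac{\vec v \cdot \hat n}{c}}{\sqrt{1 - \frac{v^2}{c^2}}}
= \frac{1 - \frac{v}{c} \cos\theta}{\sqrt{1 - \frac{v^2}{c^2}}} . \tag{376}
\]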
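However, we are moving in relation to the reference frame at rest, so we identify $\nu_1 \equiv \nu_o$ as the frequency we observe for
light which in the reference frame “at rest” has frequency $\nu_2 \equiv \nu_e$, so
\[ \frac{\nu_o}{\nu_e} = \frac{\sqrt{1 - \frac{v^2}{c^2}}}{1 - \frac{v}{c} \cos\theta} . \tag{377} \]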
This means that the temperature observed in the direction θ, T (θ), is given in terms of the average
temperature hT i as
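\[
\frac{T(\theta)}{\langle T \rangle} = \frac{\sqrt{1 - \frac{v^2}{c^2}}}{1 - \frac{v}{c} \cos\theta}
= \left( 1 - \frac{v^2}{c^2} \right)^{1/2} \left( 1 - \frac{v}{c} \cos\theta \right)^{-1}
\]
\[
\approx \left( 1 - \frac{1}{2} \frac{v^2}{c^2} + \dots \right) \left( 1 + \frac{v}{c} \cos\theta + \frac{v^2}{c^2} \cos^2\theta + \dots \right)
\approx 1 + \frac{v}{c} \cos\theta + \frac{v^2}{c^2} \left( \cos^2\theta - \frac{1}{2} \right) + \dots \tag{378}
\]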
c
c
2
The motion of the observer (us) gives rise to both a dipole and other, higher order corrections. The
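observed dipole anisotropy, first detected in the 1960’s, implies that
\[
\left| \vec v_\odot - \vec v_{CMB} \right| = 370 \pm 10\ \mathrm{km/sec}
\quad \text{towards} \quad
\phi = 267.7 \pm 0.8^\circ , \quad \theta = 48.2 \pm 0.5^\circ , \tag{379}
\]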
where θ is the colatitude (polar angle) and it is in the range 0 ≤ θ ≤ π and φ is the longitude
(azimuth) and it is in the range 0 ≤ φ ≤ 2π. Therefore θ = 0 at the North Pole, θ = π/2 at the
Equator and θ = π at the South Pole.
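To see how large the dipole is in practice, eq. (378) can be evaluated for the Sun's velocity from eq. (379). The following is a minimal sketch; the function names are illustrative, and the angle passed to the functions is the angle between the line of sight and the velocity vector as in eq. (378), not the colatitude of eq. (379).

```python
import math

C_KM_S = 299792.458      # speed of light in km/s
T_MEAN_K = 2.725         # mean CMB temperature in K
V_SUN_KM_S = 370.0       # speed of the Sun relative to the CMB frame, eq. (379)


def temperature_exact(angle, beta):
    """Exact T(theta) from the first equality of eq. (378)."""
    return T_MEAN_K * math.sqrt(1.0 - beta**2) / (1.0 - beta * math.cos(angle))


def temperature_expanded(angle, beta):
    """Second-order expansion: the last line of eq. (378)."""
    c = math.cos(angle)
    return T_MEAN_K * (1.0 + beta * c + beta**2 * (c**2 - 0.5))


beta = V_SUN_KM_S / C_KM_S
dipole_amplitude_mk = 1e3 * T_MEAN_K * beta          # first-order term, in mK
quadratic_scale_uk = 1e6 * T_MEAN_K * beta**2        # scale of the v^2/c^2 term, in microkelvin

print(f"v/c                       = {beta:.3e}")
print(f"dipole amplitude          = {dipole_amplitude_mk:.3f} mK")
print(f"scale of quadratic term   = {quadratic_scale_uk:.2f} microK")
print(f"exact - expanded at 0.3 rad = "
      f"{temperature_exact(0.3, beta) - temperature_expanded(0.3, beta):.2e} K")
```

The dipole is a few millikelvin, roughly two orders of magnitude larger than the intrinsic anisotropy, which is why it has to be removed before the primordial fluctuations can be studied.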
Allowing for the Sun’s motion in the Galaxy and the motion of the Galaxy within the Local
Group, this implies that the Local Group is moving with
\[
\left| \vec v_{LG} - \vec v_{CMB} \right| \approx 600\ \mathrm{km/sec}
\quad \text{towards} \quad
\phi = 268^\circ , \quad \theta = 27^\circ . \tag{380}
\]
This “peculiar” motion is subtracted from the measured CMB radiation, after which the intrinsic
anisotropy is isolated (Fig. 32), and revealed to be about a few parts in $10^5$. Even though minuscule,
these primordial perturbations provided seeds for the structure of the Universe.
Figure 32: The CMB radiation temperature fluctuations from the 5-year WMAP data seen over the full
sky. The average temperature is 2.725 K, and the colors represent small temperature fluctuations. Red
regions are warmer, and blue colder by about 0.0002 K.
Angular Power Spectrum
We now describe the technique which allows quantification of small-scale fluctuations in the
CMB radiation field. First, define the normalized temperature Θ in direction n̂ on the celestial
sphere by the deviation from the average:
\[ \Theta(\hat n) = \frac{\Delta T}{\langle T \rangle} . \tag{381} \]
Second, we consider multipole decomposition of Θ(n̂) in terms of spherical harmonics Ylm :
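\[ \Theta(\hat n) = \Theta(\theta, \phi) = \sum_{l=0}^{\infty} \sum_{m=-l}^{l} \Theta_{lm} Y_{lm}(\theta, \phi) \tag{382} \]
with
\[ \Theta_{lm} = \int \Theta(\hat n)\, Y^*_{lm}(\hat n)\, d\Omega . \tag{383} \]
The integral above is over the entire sphere and
\[ Y_{lm}(\hat n) = Y_{lm}(\theta, \phi) = \sqrt{\frac{(2l+1)}{4\pi} \frac{(l-m)!}{(l+m)!}}\; P_l^m(\cos\theta)\, e^{i m \phi} , \tag{384} \]
with $P_l^m(x)$ the associated Legendre functions:
\[ P_l^m(x) \equiv \frac{(1 - x^2)^{m/2}}{2^l\, l!} \frac{d^{m+l}}{dx^{m+l}} \left( x^2 - 1 \right)^l . \tag{385} \]
The basis functions are orthonormal:
\[ \int_{\theta=0}^{\pi} \int_{\phi=0}^{2\pi} Y_{lm} Y^*_{l'm'}\, d\Omega = \delta_{ll'} \delta_{mm'} , \tag{386} \]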
Figure 33: Power spectrum of CMB radiation.
where $\delta_{nn'}$ is the Kronecker delta (=1 when n = n′, =0 otherwise), and $d\Omega = \sin\theta\, d\phi\, d\theta$.
The field of Gaussian random fluctuations is fully characterized by its power spectrum $\langle \Theta^*_{lm} \Theta_{l'm'} \rangle$.
The order m describes the angular orientation of a fluctuation mode, and the degree (multipole)
l determines its characteristic angular size. Therefore, in a Universe with no preferred direction
(isotropic), we expect the power spectrum to be independent of m. Also, in a Universe which is
the same from point to point (homogeneous), we expect the power spectrum to be independent
of l. Finally, we define the angular power spectrum $C_l$ through
\[ \langle \Theta^*_{lm} \Theta_{l'm'} \rangle = \delta_{ll'} \delta_{mm'} C_l . \tag{387} \]
The brackets denote the average over the skies with the same cosmology. The best estimate of Cl
is then from the average over m.
Cosmic variance. From eq. (382), we can see that each of the multipoles l is determined by
harmonics with m ∈ [−l, l], a total of (2l + 1). This poses a fundamental limit in determining the
power. This is called the cosmic variance:
\[ \frac{\Delta C_l}{C_l} = \sqrt{\frac{2}{2l + 1}} . \tag{388} \]
The cosmic variance states that it is only possible to observe part of the Universe at one particular
time, so it is difficult to make statistical statements about cosmology on the scale of the entire
Universe.
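The origin of eq. (388) can be illustrated with a small Monte Carlo experiment. The sketch below makes a simplifying assumption that is not spelled out in the notes: for each l, the 2l + 1 coefficients are modeled as independent real Gaussian numbers with variance $C_l$. Each simulated "sky" then yields an estimate of $C_l$ as the average of the squared coefficients over m, and the scatter of that estimate over many skies is compared with eq. (388). The function names, the number of skies and the seed are illustrative.

```python
import math
import random
import statistics


def cosmic_variance(l):
    """Fractional uncertainty on C_l from eq. (388)."""
    return math.sqrt(2.0 / (2 * l + 1))


def simulated_fractional_scatter(l, c_l=1.0, n_skies=2000, seed=0):
    """Scatter of the m-averaged estimate of C_l over many simulated skies."""
    rng = random.Random(seed)
    sigma = math.sqrt(c_l)
    estimates = []
    for _ in range(n_skies):
        modes = [rng.gauss(0.0, sigma) for _ in range(2 * l + 1)]
        estimates.append(sum(a * a for a in modes) / (2 * l + 1))
    return statistics.pstdev(estimates) / c_l


for l in (2, 10, 50):
    print(f"l = {l:3d}:  eq. (388) = {cosmic_variance(l):.3f},  "
          f"simulated = {simulated_fractional_scatter(l):.3f}")
```

The scatter shrinks as l grows because more independent modes are averaged, which is why the low multipoles carry the largest irreducible uncertainty.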
The standard Big Bang model features an epoch of cosmic inflation. In inflationary models,
the observer only sees a tiny fraction of the whole Universe. So the observable Universe (the so-called particle horizon of the Universe) is the result of processes that follow some general physical
laws, including quantum mechanics and GR. Some of these processes are random: for example, the
distribution of galaxies throughout the Universe can only be described statistically and cannot be
derived from first principles.
This raises philosophical problems: suppose that random physical processes happen on length
scales both smaller than and bigger than the horizon. A physical process (such as an amplitude of a
primordial perturbation in density) that happens on the horizon scale only gives us one observable
realization. A physical process on a larger scale gives us zero observable realizations. A physical
process on a slightly smaller scale gives us a small number of realizations. Therefore, even if the
bit of the Universe observed is the result of a statistical process, the observer can only view one
realization of that process, so our observation is statistically insignificant for saying much about
the model, unless the observer is careful to include the variance.
On small sections of the sky where its curvature can be neglected, the spherical harmonic
analysis becomes ordinary Fourier analysis in two dimensions. In this limit l becomes the Fourier
wavenumber. Since the angular wavelength is θ = 2π/l, large multipole moments correspond to
small angular scales, with $l \sim 10^2$ representing degree-scale separations. The power spectrum is
traditionally displayed in the literature as the power per logarithmic interval in l,
\[ \Delta_T^2 \equiv \frac{l(l+1)}{2\pi}\, C_l\, T_{CMB}^2 , \tag{389} \]
where TCMB is the black-body temperature of the CMB radiation. Figure 33 shows the measurements of this quantity by several experiments.
The power spectrum shown in Fig. 33 begins at l = 2 and exhibits large errors at low multipoles.
The reason is that the predicted power spectrum is the average power in the multipole moment
l an observer would see in an ensemble of Universes. However a real observer is limited to one
Universe and one sky with its one set of Θlm ’s, 2l + 1 numbers for each l. This is particularly
problematic for the monopole and dipole (l = 0, 1). If the monopole were larger in our vicinity
than its average value, we would have no way of knowing it. Likewise for the dipole, we have no way
of distinguishing a cosmological dipole from our own peculiar motion with respect to the CMB rest
frame. Nonetheless, the monopole and dipole are of the utmost significance in the early Universe.
It is precisely the spatial and temporal variation of these quantities, especially the monopole, which
determines the pattern of anisotropies we observe today.
Lecture 20: Cosmic Microwave Background Radiation
— continued
“Innocent light-minded men, who think that astronomy can be learnt by looking at the stars
without knowledge of mathematics will, in next life, be birds.”
Plato
The Big Picture: Today we are finishing the discussion of the CMB radiation, including the
analysis of the acoustic peaks and effects leading to anisotropies.
Scales in the Angular Power Spectrum
The angular power spectrum quantifies the correlation of different parts of the sky we observe
separated by an angle θ. This angle is related to a multipole l of the expansion as θ = 180°/l.
The size of the observable Universe (horizon) at the time of decoupling corresponds to about 1° on
the sky today (l ≈ 200). The part of the angular spectrum which correlates portions of the sky
separated by angles appreciably larger than the size of the horizon at decoupling (corresponding
to l ≲ 20) represents initial conditions: these parts of the Universe have not been in causal contact
since (before) inflation (Fig. 34). The other part of the angular spectrum, at high l values,
features peaks corresponding to acoustic oscillations (Fig. 35). The positions and magnitudes of the
peaks of acoustic oscillations encode fundamental information about the geometry and structure of
the Universe.
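The conversion between multipole and angular scale used in this paragraph is simple enough to tabulate. The following is a minimal sketch that uses the θ = 180°/l rule of thumb quoted above; the function names are illustrative.

```python
def multipole_to_degrees(l):
    """Angular scale in degrees for multipole l, using theta = 180 deg / l."""
    return 180.0 / l


def degrees_to_multipole(theta_deg):
    """Multipole corresponding to an angular scale in degrees."""
    return 180.0 / theta_deg


for l in (2, 20, 200, 1000):
    print(f"l = {l:5d}  ->  theta = {multipole_to_degrees(l):7.2f} deg")

print(f"theta = 1 deg  ->  l = {degrees_to_multipole(1.0):.0f}")
```

Multipoles below about 20 therefore probe separations of roughly nine degrees or more, well beyond the horizon size at decoupling, while the acoustic peaks sit at degree and sub-degree scales.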
Figure 34: CMB horizon (Courtesy of W. Hu)
Figure 35: CMB angular power spectrum (Hu & White, Scientific American, February 2004).
Acoustic Oscillations
In the early Universe before decoupling, rapid scattering couples photons and baryons into a
plasma which behaves as perfect fluid. Initial quantum overdensities create potential (gravitational)
wells — inflationary seeds of the Universe’s structure. Infall of the fluid into the potential wells is
resisted by its pressure, thus forming acoustic oscillations: periodic compressions (overdensities in
the fluid; hot spots) and rarefactions (underdensities; cold spots). These acoustic oscillations of
the early Universe are frozen at recombination and give the CMB spectrum a unique signature.
The CMB data reveal that the initial inhomogeneities in the Universe were small. An overdense
region would grow by gravitationally attracting more mass, but only after the entire region is in
causal contact. This means that only regions which are smaller than the horizon at decoupling had
time to compress before then. Regions which are sufficiently smaller than the horizon had enough
time to compress gravitationally until the outward-acting pressure halted the compression via
Thomson scattering, and possibly even go through a number of such acoustic oscillations. Therefore,
perturbations of particular sizes may have gone through: (i) one compression (fundamental wave);
(ii) one compression and one rarefaction (first overtone); (iii) one compression, one rarefaction
and one compression again (second overtone); etc. (Fig. 36).
The most pronounced temperature variation in the CMB radiation will be due to the fundamental sound wave. This is because the portions of the sky separated by the scale equal to the
horizon at decoupling — corresponding to the fundamental sound wave — will be completely out
of phase.
Consider a standing wave $A_k(x, t) \propto \sin(kx) \cos(\omega t)$, going through space at the speed of sound
(in plasma $v_s \approx c/\sqrt{3}$), with the frequency ω and wave number k, related by $\omega = k v_s$. The
displacement, and hence the correlation in temperature, will be maximal at the decoupling
time $t_{dec}$ for $\omega t_{dec} = k v_s t_{dec} = \pi, 2\pi, 3\pi, \dots$ The subsequent peaks in the power spectrum represent
Figure 36: Sound waves in a pipe (top) and acoustic waves in the early Universe (Hu & White, Scientific
American, February 2004).
the temperature variations caused by overtones. The series of peaks strongly supports the theory
that inflation triggered all of the sound waves at the same time. If the perturbations had been continuously
generated over time, the power spectrum would not be so harmoniously ordered.
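The standing-wave condition $k v_s t_{dec} = n\pi$ can be illustrated with a minimal sketch. The function name `peak_wavenumbers` and the value of `t_dec` are assumptions chosen for illustration only, and the sketch ignores the expansion of the Universe, just as the simplified argument above does.

```python
import math

C = 2.998e10  # speed of light in cm/s

def peak_wavenumbers(t_dec, n_peaks=3, v_s=C / math.sqrt(3)):
    """Wave numbers k_n (in 1/cm) satisfying k_n * v_s * t_dec = n * pi."""
    return [n * math.pi / (v_s * t_dec) for n in range(1, n_peaks + 1)]

# Illustrative value only (assumption): t_dec of about 380,000 yr, in seconds.
t_dec = 3.8e5 * 3.156e7
k = peak_wavenumbers(t_dec)

# Overtones sit at integer multiples of the fundamental wave number.
ratios = [kn / k[0] for kn in k]
print(ratios)
```

The harmonic spacing of the ratios is what the text refers to as a "harmoniously ordered" spectrum.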
Damping of the overtones. Both ordinary matter and dark matter supply mass to the
primordial plasma and enhance the gravitational pull, but only ordinary matter undergoes the
sonic compressions and rarefactions (dark matter has decoupled from the plasma at a much earlier
time). At recombination, the fundamental wave is frozen in a phase where gravity enhances its
compression of the denser regions of plasma (Fig. 37). The first overtone, which corresponds to
scales half of the fundamental wavelength, is caught in the opposite phase (Fig. 37, bottom panel):
gravity is attempting to compress the plasma while the plasma pressure is trying to expand it.
As a consequence, the temperature variations caused by this overtone (and all subsequent ones)
will be less pronounced than those caused by the fundamental wave (fundamental peak).
This damping of the magnitudes of the overtones allows for quantification of the relative
strength of gravity and radiation pressure in the early Universe.
Figure 37: Gravitational modulation: gravity and acoustic oscillation work in phase in the first peak (top);
gravity and acoustic oscillations attenuate each other’s effects (Hu & White, Scientific American, February
2004).
Damping of the small-scale acoustic waves. The theory of inflation also predicts that the
sound waves should have nearly the same amplitude on all scales. The power spectrum, however,
shows a sharp drop-off in magnitude of temperature variations after the third peak. This is due
to the dissipation of the sound waves with short wavelengths: because sound is carried by oscillation of
particles in gas or plasma, a wave cannot propagate if its wavelength is shorter than the typical
distance traveled by particles between collisions.
Polarization of the CMB
Researchers have recently detected that the CMB radiation is polarized. Careful and precise
study of this area is believed to be the most promising avenue toward discovering new fundamental
physics.
The polarization, unlike the temperature anisotropies, is generated only by scattering. When
we observe the polarization we are looking directly at the surface of the last scattering of photons.
It is therefore our most direct probe of the Universe at the epoch of recombination as well as the
later reionization of the Universe by the first stars. The latter can really only be probed by the
CMB through its polarization.
Figure 38: Generation of polarization: unpolarized but anisotropic radiation incident on an electron produces polarized radiation. Intensity is represented by line thickness. To an observer looking along the direction of the
scattered photons (z), the incoming quadrupole pattern produces linear polarization along the y-direction.
The polarization, which carries directional information on the sky (as a tensor field), contains
more information than the temperature field. Measurements of the polarization power spectrum
can greatly enhance the precision with which one can extract the physical parameters associated
with acoustic oscillations.
Furthermore, the polarization through its directional information provides a means of isolating
the gravitational waves predicted by models of inflation. As such polarization provides our most
direct window onto the very early Universe and the origin of all structure in the Universe.
Origin of polarization. Quadrupole anisotropy polarizes the anisotropic (but unpolarized) radiation (Fig. 38). The CMB radiation is polarized by Thomson scattering in the following manner.
Consider incoming radiation from the left being Thomson-scattered by 90° out of the screen. Since
light cannot be polarized along its direction of motion, only one linear polarization gets Thomson-scattered. However, there is nothing special about light coming in from the left: if the light also
comes from the top, the resulting scattered radiation will have both polarization states. The degree
of polarization will depend on the intensity of the incoming radiation, so the 90° anisotropies in the
radiation will result in linear polarization (Fig. 38).
Shift to high l. Because the polarization arises from scattering, which in turn dilutes the
quadrupole, the anisotropies in polarization are much weaker than anisotropies in temperature.
With each scattering that the photon experiences as it approaches equilibrium, the polarization
is reduced. The remaining polarization is a direct result of the stoppage of scattering. The local
quadrupole on the scales which are much larger than the mean-free path of photons (for instance,
the scale of the horizon) will be diluted by multiple scattering, and therefore not dominant in the
spectrum. The peak of the spectrum is shifted toward smaller scales (large l values), where the
local quadrupole is close to the mean-free path of photons.
Physical Effects Affecting the CMB Radiation
The Sunyaev–Zel’dovich Effect. The Sunyaev-Zel’dovich (SZ) effect refers to the Compton
scattering of CMB photons by hot, ionized gas in clusters of galaxies. It was first predicted in
1969 by Sunyaev and Zel’dovich. The effect is a foreground anisotropy to the CMB. The SZ effect
causes a “hotspot” in the CMB due to the kinetic SZ effect (due to the bulk motion of the cluster
with respect to the CMB) and a noticeable change in the shape of the CMB spectrum due to the
thermal SZ effect.
The SZ effect is important to the study of cosmology and the CMB for two main reasons:
1. The observed “hotspots” created by the kinetic effect will distort the power spectrum of CMB
anisotropies. These need to be separated from the primary anisotropies in order to probe
properties of inflation.
2. The thermal SZ effect can be measured and combined with X-ray observations in order to
determine values of cosmological parameters, in particular the present value of the Hubble
rate H0 .
Interaction between photons of the CMB and charged particles they encounter as they pass
through the hot, ionized gas in clusters of galaxies causes them to scatter, thus polarizing the CMB
radiation across wide swaths of the sky. Observations of this large-angle polarization by the WMAP
spacecraft imply that about 17 percent of the CMB photons were scattered by a thin fog of ionized
gas a few hundred million years after the Big Bang.
This relatively large fraction is perhaps the biggest surprise from the WMAP data. Cosmologists
had previously theorized that most of the Universe's hydrogen and helium would have been ionized
by the radiation from the first stars, which were extremely massive and bright. (This process is
called reionization because it returned the gases to the plasma state that existed before the emission
of the CMB.) But the theorists estimated that this event occurred nearly a billion years after the
Big Bang, and therefore only about 5 percent of the CMB photons would have been scattered.
WMAP's evidence of a higher fraction indicates a much earlier reionization and presents a challenge
for the modeling of the first rounds of star formation. The discovery may even challenge the theory
of inflation's prediction that the initial density fluctuations in the primordial Universe were nearly
the same at all scales. The first stars might have formed sooner if the small-scale fluctuations had
higher amplitudes. The WMAP data also contain another hint of deviation from scale invariance
that was first observed by the COBE satellite. On the biggest scales, corresponding to regions
stretching more than 60 degrees across the sky, both WMAP and COBE found a curious lack of
temperature variations in the CMB. This deficit may well be a statistical fluke: because the sky is
only 360 degrees around, it may not contain enough large-scale regions to make an adequate sample
for measuring temperature variations. But some theorists have speculated that the deviation may
indicate inadequacies in the models of inflation, dark energy or the topology of the Universe.
Sachs-Wolfe Effect. At last scattering the baryons and photons decouple and the photons suddenly find themselves free to travel in straight paths through the Universe. However, the baryons
are clustered together in gravitational potential wells prior to last scattering. Since the photons are
tightly coupled to the baryons before last scattering, they are confined to potential wells too. Thus
the photons have to climb out of potential wells when they are suddenly freed at last scattering.
This climb requires some energy and the photons are therefore redshifted. The subsequent rise at
low l in the CMB power spectrum is known as the Sachs-Wolfe (SW) effect, and since it is imprinted
on the CMB power spectrum at the time of last scattering, it is considered a primary anisotropy.
This effect is the predominant source of fluctuations in the CMB for angular scales above about
ten degrees — the regions in the early Universe which were too big to undergo acoustic oscillations.
Integrated Sachs-Wolfe Effect. The Integrated Sachs-Wolfe (ISW) effect is also caused by
gravitational redshift; here, however, it occurs between the surface of last scattering and the Earth,
so it is not a fundamental part of the CMB.
The ISW effect can arise after last scattering as the photons free stream through the Universe.
Although the photons are no longer tightly coupled to the baryons, they can still slip into potential
wells and have to climb back out. When they fall in, the photons gain some energy (are blueshifted)
and when they climb back out, they are redshifted. Assuming that the depth of the potential well
remains constant while the photon traverses it, the redshift exactly cancels the blueshift. No trace
of the photon’s passage through the potential well remains, assuming that both sides of the dip are
the same height and no energy is dissipated. Suppose, however, that the potential well through
which the photon passes either decays or deepens while the photon is inside. Then its redshift and
blueshift will not exactly cancel; instead the photon gains or loses some energy (respectively) from
its passage through the potential well.
There are two main contributions to the integrated effect. The first occurs shortly after photons
leave the last scattering surface, and is due to the evolution of the potential wells as the Universe
changes from being dominated by radiation to being dominated by matter. The second, sometimes
called the ‘late-time integrated Sachs-Wolfe effect’, arises much later as the evolution starts to
feel the effect of the cosmological constant (or, more generally, dark energy), or curvature of the
Universe if it is not flat. The latter effect has an observational signature in the amplitude of the
large scale perturbations of the CMB and their correlation with the large scale structure.
The primary anisotropies (SW) on the CMB power spectrum tell us about the initial conditions
of the photons, and any passage through a potential well that results in a net energy loss or gain
changes these conditions and leaves a mark on the spectrum — the secondary anisotropy (ISW).
Determining the Cosmic Parameters from CMB Radiation
Baryonic matter content ($\Omega_b$). The relative magnitude of the first overtone to the fundamental
peak in the power spectrum of the CMB radiation enables precise quantification of the relative strengths
of gravity and radiation in the early Universe. It has been determined that the energy in baryons
was about the same as the energy in CMB photons at the time of decoupling, which (through
the scaling which we have done in previous classes; recall $\rho_\gamma \propto a^{-4}$) puts the baryonic content of the
Universe at about 5 percent. This is in excellent agreement with the predictions of the BBN.
Dark energy ($\Omega_\Lambda$). Because dark energy accelerates the expansion of the Universe, it weakens
the gravitational-potential wells associated with galaxy clustering (ISW effect). These effects can
be detected and quantified in the large-scale variations of the CMB radiation (low l values).
Hubble rate (H0 ). SZ effect is used to measure the present-day value of the Hubble rate (H0 ).
Lecture 21: The Schwarzschild Metric and Black Holes
“All of physics is either impossible or trivial. It is impossible until you understand it, and then it
becomes trivial.”
Ernest Rutherford
The Big Picture: Today we are starting the third (and last) part of the course: black holes,
stars and galaxies. We show that Einstein's field equations imply the existence of space-time
singularities, which we now know as “black holes”.
The Schwarzschild Problem
Shortly after Einstein published his field equations of GR, Karl Schwarzschild solved them to
find the space-time geometry outside a stationary, spherical distribution of matter of mass M .
Since the space outside the distribution is empty, the energy-momentum tensor $T_{\alpha\beta}$ vanishes, so
Einstein's field equation becomes:
$$
R_{\alpha\beta} - \frac{1}{2} g_{\alpha\beta} R = 0, \tag{390}
$$
with an appropriate metric tensor. The appropriate boundary conditions are:
1. metric must match interior metric at the body’s surface;
2. metric must go to the flat (Minkowski) metric far away from the body.
We now solve for the Schwarzschild metric gαβ which solves the Schwarzschild problem. We start
with a general static and isotropic metric:
1. static: both time-independent and symmetric under time reversal
(only time-independent ⇐⇒ stationary);
2. isotropic: invariant under spatial rotations (same in all directions).
The interval satisfying these criteria may be written as
$$
ds^2 = -A(r)\,dt^2 + B(r)\,dr^2 + r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right), \tag{391}
$$
where the first two terms on the RHS describe radial behavior (isotropy), and the last two the
surface of the sphere (spherical symmetry). It can be expressed in many equivalent forms. One
convenient form is:
$$
ds^2 = -e^{N(r)}\,dt^2 + e^{P(r)}\,dr^2 + r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right), \tag{392}
$$
corresponding to the metric tensor
$$
g_{\alpha\beta} =
\begin{pmatrix}
-e^{N(r)} & 0 & 0 & 0 \\
0 & e^{P(r)} & 0 & 0 \\
0 & 0 & r^2 & 0 \\
0 & 0 & 0 & r^2 \sin^2\theta
\end{pmatrix}. \tag{393}
$$
The Schwarzschild problem reduces to solving for N (r) and P (r) from Einstein’s field equations
and the appropriate boundary conditions.
Solving the Schwarzschild Problem
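Earlier we have defined an alternative Lagrangian [eq. (26)]:
$$
L = \frac{1}{2}\, g_{\alpha\beta}\, \dot{x}^\alpha \dot{x}^\beta, \tag{394}
$$
(where dot denotes s-derivative) which for the metric in eq. (393) becomes ($x^0 \to t$, $x^1 \to r$, $x^2 \to \theta$,
$x^3 \to \phi$):
$$
L = -\frac{1}{2} e^{N} \dot{t}^2 + \frac{1}{2} e^{P} \dot{r}^2 + \frac{1}{2} r^2 \dot{\theta}^2 + \frac{1}{2} r^2 \sin^2\theta\, \dot{\phi}^2. \tag{395}
$$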
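This alternative Lagrangian allows us to easily read off Christoffel symbols by comparing it to the
geodesic equation [eq. (31)]:
$$
\frac{d^2 x^\nu}{ds^2} + \Gamma^\nu_{\gamma\delta} \frac{dx^\gamma}{ds} \frac{dx^\delta}{ds} = 0, \tag{396}
$$
which we can combine to obtain the Riemann and Ricci tensors. Let us solve the Lagrange equations
$$
\frac{\partial L}{\partial x^\alpha} - \frac{d}{ds} \frac{\partial L}{\partial \dot{x}^\alpha} = 0,
$$
for each of the components of the space-time (′ denotes r-derivative):
• t-component:
$$
\begin{aligned}
\frac{\partial L}{\partial t} - \frac{d}{ds} \frac{\partial L}{\partial \dot{t}} &= 0 \\
0 - \frac{d}{ds}\left(-e^{N} \dot{t}\right) &= 0 \\
e^{N} \frac{dN}{dr} \frac{dr}{ds}\, \dot{t} + e^{N} \ddot{t} &= 0 \\
e^{N} \left[\ddot{t} + N' \dot{t}\dot{r}\right] &= 0 \\
\frac{d^2 t}{ds^2} + N' \frac{dt}{ds} \frac{dr}{ds} &= 0.
\end{aligned} \tag{397}
$$
After comparing it to eq. (396), we obtain
$$
\frac{d^2 t}{ds^2} + \left(\Gamma^0_{01} + \Gamma^0_{10}\right) \frac{dt}{ds} \frac{dr}{ds} = 0, \tag{398}
$$
which means that (because of symmetry of the Christoffel symbols: $\Gamma^\alpha_{\beta\gamma} = \Gamma^\alpha_{\gamma\beta}$)
$$
\Gamma^0_{01} = \Gamma^0_{10} = \frac{1}{2} N', \tag{399}
$$
while other $\Gamma^0_{\alpha\beta}$ symbols vanish.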
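• r-component:
$$
\begin{aligned}
\frac{\partial L}{\partial r} - \frac{d}{ds} \frac{\partial L}{\partial \dot{r}} &= 0 \\
-\frac{1}{2} N' e^{N} \dot{t}^2 + \frac{1}{2} P' e^{P} \dot{r}^2 + r \dot{\theta}^2 + r \sin^2\theta\, \dot{\phi}^2 - \frac{d}{ds}\left(e^{P} \dot{r}\right) &= 0 \\
-\frac{1}{2} N' e^{N} \dot{t}^2 + \frac{1}{2} P' e^{P} \dot{r}^2 + r \dot{\theta}^2 + r \sin^2\theta\, \dot{\phi}^2 - e^{P} P' \dot{r}^2 - e^{P} \ddot{r} &= 0 \\
-e^{P} \left[\ddot{r} + \frac{1}{2} N' e^{N-P} \dot{t}^2 + \frac{1}{2} P' \dot{r}^2 - e^{-P} r \dot{\theta}^2 - e^{-P} r \sin^2\theta\, \dot{\phi}^2\right] &= 0 \\
\frac{d^2 r}{ds^2} + \frac{1}{2} N' e^{N-P} \left(\frac{dt}{ds}\right)^2 + \frac{1}{2} P' \left(\frac{dr}{ds}\right)^2 - e^{-P} r \left(\frac{d\theta}{ds}\right)^2 - e^{-P} r \sin^2\theta \left(\frac{d\phi}{ds}\right)^2 &= 0.
\end{aligned} \tag{400}
$$
After comparing it to eq. (396), we obtain
$$
\frac{d^2 r}{ds^2} + \Gamma^1_{00} \left(\frac{dt}{ds}\right)^2 + \Gamma^1_{11} \left(\frac{dr}{ds}\right)^2 + \Gamma^1_{22} \left(\frac{d\theta}{ds}\right)^2 + \Gamma^1_{33} \left(\frac{d\phi}{ds}\right)^2 = 0, \tag{401}
$$
which means that
$$
\begin{aligned}
\Gamma^1_{00} &= \frac{1}{2} N' e^{N-P}, \\
\Gamma^1_{11} &= \frac{1}{2} P', \\
\Gamma^1_{22} &= -e^{-P} r, \\
\Gamma^1_{33} &= -e^{-P} r \sin^2\theta,
\end{aligned} \tag{402}
$$
while other $\Gamma^1_{\alpha\beta}$ symbols vanish.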
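• θ-component:
$$
\begin{aligned}
\frac{\partial L}{\partial \theta} - \frac{d}{ds} \frac{\partial L}{\partial \dot{\theta}} &= 0 \\
r^2 \sin\theta \cos\theta\, \dot{\phi}^2 - \frac{d}{ds}\left(r^2 \dot{\theta}\right) &= 0 \\
\frac{1}{2} r^2 \sin 2\theta\, \dot{\phi}^2 - 2 r \dot{r} \dot{\theta} - r^2 \ddot{\theta} &= 0 \\
-r^2 \left[\ddot{\theta} - \frac{1}{2} \sin 2\theta\, \dot{\phi}^2 + 2 \frac{\dot{r}}{r} \dot{\theta}\right] &= 0 \\
\frac{d^2 \theta}{ds^2} + \frac{2}{r} \frac{dr}{ds} \frac{d\theta}{ds} - \frac{1}{2} \sin 2\theta \left(\frac{d\phi}{ds}\right)^2 &= 0.
\end{aligned} \tag{403}
$$
After comparing it to eq. (396), we obtain
$$
\frac{d^2 \theta}{ds^2} + \left(\Gamma^2_{12} + \Gamma^2_{21}\right) \frac{dr}{ds} \frac{d\theta}{ds} + \Gamma^2_{33} \left(\frac{d\phi}{ds}\right)^2 = 0, \tag{404}
$$
which means that
$$
\Gamma^2_{12} = \Gamma^2_{21} = \frac{1}{r}, \qquad \Gamma^2_{33} = -\frac{1}{2} \sin 2\theta, \tag{405}
$$
while other $\Gamma^2_{\alpha\beta}$ symbols vanish.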
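• φ-component:
$$
\begin{aligned}
\frac{\partial L}{\partial \phi} - \frac{d}{ds} \frac{\partial L}{\partial \dot{\phi}} &= 0 \\
0 - \frac{d}{ds}\left(r^2 \sin^2\theta\, \dot{\phi}\right) &= 0 \\
-2 r \dot{r} \sin^2\theta\, \dot{\phi} - 2 r^2 \sin\theta \cos\theta\, \dot{\theta} \dot{\phi} - r^2 \sin^2\theta\, \ddot{\phi} &= 0 \\
-r^2 \sin^2\theta \left[\ddot{\phi} + 2 \frac{\dot{r}}{r} \dot{\phi} + 2 \frac{\cos\theta}{\sin\theta} \dot{\theta} \dot{\phi}\right] &= 0 \\
\frac{d^2 \phi}{ds^2} + \frac{2}{r} \frac{dr}{ds} \frac{d\phi}{ds} + 2 \cot\theta\, \frac{d\theta}{ds} \frac{d\phi}{ds} &= 0.
\end{aligned} \tag{406}
$$
After comparing it to eq. (396), we obtain
$$
\frac{d^2 \phi}{ds^2} + \left(\Gamma^3_{13} + \Gamma^3_{31}\right) \frac{dr}{ds} \frac{d\phi}{ds} + \left(\Gamma^3_{23} + \Gamma^3_{32}\right) \frac{d\theta}{ds} \frac{d\phi}{ds} = 0, \tag{407}
$$
which means that
$$
\Gamma^3_{13} = \Gamma^3_{31} = \frac{1}{r}, \qquad \Gamma^3_{23} = \Gamma^3_{32} = \cot\theta, \tag{408}
$$
while other $\Gamma^3_{\alpha\beta}$ symbols vanish.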
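A few of the Christoffel symbols read off above can be cross-checked numerically. The sketch below is only an illustration, not part of the original derivation: it evaluates the general formula $\Gamma^a_{bc} = \frac{1}{2} g^{ad}\left(\partial_b g_{dc} + \partial_c g_{db} - \partial_d g_{bc}\right)$ for a diagonal metric with central finite differences. The helper names (`metric_diag`, `christoffel`) and the trial functions `N` and `P` are assumptions made for the example.

```python
import math

def metric_diag(x, N, P):
    """Diagonal entries of g for ds^2 = -e^N dt^2 + e^P dr^2 + r^2 dtheta^2 + r^2 sin^2(theta) dphi^2."""
    t, r, th, ph = x
    return [-math.exp(N(r)), math.exp(P(r)), r * r, (r * math.sin(th)) ** 2]

def christoffel(a, b, c, x, N, P, h=1e-5):
    """Gamma^a_{bc} for a diagonal metric, using central finite differences."""
    def dg(d, e, k):
        # partial derivative of g_{de} with respect to x^k (zero off the diagonal)
        if d != e:
            return 0.0
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        return (metric_diag(xp, N, P)[d] - metric_diag(xm, N, P)[d]) / (2 * h)

    g_aa = metric_diag(x, N, P)[a]
    return 0.5 / g_aa * (dg(a, c, b) + dg(a, b, c) - dg(b, c, a))

# Trial functions (assumption): any smooth N(r), P(r) would do for this check.
N = lambda r: math.log(1.0 - 1.0 / r)
P = lambda r: -math.log(1.0 - 1.0 / r)
x = [0.0, 3.0, 1.0, 0.5]  # (t, r, theta, phi)

print(christoffel(0, 0, 1, x, N, P))  # compare with N'/2
print(christoffel(1, 2, 2, x, N, P))  # compare with -e^{-P} r
print(christoffel(2, 3, 3, x, N, P))  # compare with -(1/2) sin(2 theta)
print(christoffel(3, 2, 3, x, N, P))  # compare with cot(theta)
```

Indices follow the text: 0 is t, 1 is r, 2 is θ and 3 is φ.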
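These Christoffel symbols associated with the metric given in eq. (393) are needed to compute
the Riemann tensor, which, in turn, is used to compute the Ricci tensor and Ricci scalar, to fully
determine the LHS of Einstein's equation: $G_{\alpha\beta} \equiv R_{\alpha\beta} - \frac{1}{2} g_{\alpha\beta} R = 0$.
It can be shown (Homework set #3) that $G_{\alpha\beta} = 0$ leads to
$$
\begin{aligned}
-\frac{e^{N-P}}{r} \left(P' - \frac{1}{r}\right) - \frac{e^{N}}{r^2} &= 0, \\
-\frac{N'}{r} - \frac{1}{r^2} \left(1 - e^{P}\right) &= 0, \\
-\frac{1}{2} r^2 e^{-P} \left[N'' - \frac{1}{2} P' N' + \frac{1}{2} \left(N'\right)^2 + \frac{N' - P'}{r}\right] &= 0.
\end{aligned} \tag{409}
$$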
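These expressions combine (Homework set #3) to give
$$
\frac{dN}{dr} = -\frac{dP}{dr} = -\frac{1}{r} \left(1 - e^{P}\right),
$$
which can be solved for P :
$$
\begin{aligned}
\int \frac{dP}{1 - e^{P}} &= \int \frac{dr}{r} \\
\int \left[\frac{1 - e^{P}}{1 - e^{P}} + \frac{e^{P}}{1 - e^{P}}\right] dP &= \ln Cr \\
P - \ln\left(1 - e^{P}\right) = \ln e^{P} - \ln\left(1 - e^{P}\right) = \ln \frac{e^{P}}{1 - e^{P}} &= \ln Cr \\
\frac{e^{P}}{1 - e^{P}} = Cr \quad \Longrightarrow \quad e^{P} &= \frac{Cr}{1 + Cr}.
\end{aligned} \tag{410}
$$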
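Solving for N we obtain
$$
N = -P + \text{const.} \quad \Longrightarrow \quad e^{N} = e^{\text{const.}} e^{-P}, \tag{411}
$$
but since we have to recover the Minkowski metric at large distances:
$$
\lim_{r \to \infty} g_{00} \to -1, \qquad \lim_{r \to \infty} g_{11} \to 1, \tag{412}
$$
we must have const. = 0. Therefore,
$$
\begin{aligned}
N &= -P \\
g_{00} = -e^{N} = -e^{-P} &= -\frac{1}{g_{11}} = -\frac{1 + Cr}{Cr} = -\left(1 + \frac{1}{Cr}\right).
\end{aligned} \tag{413}
$$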
For weak gravitational fields, we derived in eq. (39) including the constants:
$$
g_{00} = -\left(1 + \frac{2\Phi}{c^2}\right) = -\left(1 - \frac{2GM}{rc^2}\right) = -\frac{1}{g_{11}} \quad \Longrightarrow \quad C = -\frac{c^2}{2GM}. \tag{414}
$$
We finally arrive at the solution to the Schwarzschild problem, and the corresponding line element
in the Schwarzschild metric (with constants c and G included explicitly):
$$
ds^2 = -\left(1 - \frac{2GM}{rc^2}\right) c^2 dt^2 + \frac{dr^2}{1 - \frac{2GM}{rc^2}} + r^2 d\theta^2 + r^2 \sin^2\theta\, d\phi^2. \tag{415}
$$
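As a sanity check on the solution, one can verify numerically that $e^{P} = 1/(1 - r_s/r)$, with $r_s = 2GM/c^2$, satisfies the first-order equation $dP/dr = (1 - e^{P})/r$ used in the derivation. This is a minimal sketch with radii measured in units of $r_s$; the function names are hypothetical.

```python
import math

def P_schw(r, r_s=1.0):
    """P(r) for the Schwarzschild solution: e^P = 1 / (1 - r_s / r)."""
    return -math.log(1.0 - r_s / r)

def ode_residual(r, r_s=1.0, h=1e-6):
    """dP/dr - (1 - e^P)/r, with dP/dr taken from a central finite difference."""
    dP_dr = (P_schw(r + h, r_s) - P_schw(r - h, r_s)) / (2 * h)
    return dP_dr - (1.0 - math.exp(P_schw(r, r_s))) / r

for r in (1.5, 3.0, 10.0, 100.0):
    # residuals should be zero up to finite-difference error
    print(r, ode_residual(r))

# Far from the mass, g_00 = -e^{-P} should approach the Minkowski value -1.
print(-math.exp(-P_schw(1e9)))
```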
Birkhoff’s theorem. The derivation of the Schwarzschild metric does not require any other
information about the distribution of the matter giving rise to the gravitational field. It only
requires that the distribution:
• is spherically symmetric;
• has zero density at the radius of interest.
Birkhoff showed that any spherically symmetric vacuum solution of Einstein’s field equations must
also be static and agree with Schwarzschild’s solution. Therefore, the spherically symmetric mass
leads to the Schwarzschild metric regardless of whether the mass is static, collapsing, expanding
or pulsating. This, of course, refers to the field outside the mass, as first stated in the derivation,
because we start with Tαβ = 0. Two of the most important features of Newtonian gravity therefore
apply to GR:
• the gravity of a spherical body appears to act from a central point mass;
• the gravitational field inside a spherical shell vanishes.
Schwarzschild Radius, Event Horizon and Black Holes
The Schwarzschild space-time metric has a singularity when the denominator in the second
term is equal to zero:
$$
1 - \frac{2GM}{c^2 r} = 0, \tag{416}
$$
which happens when the radius associated with mass M is
$$
r_s = \frac{2GM}{c^2}. \tag{417}
$$
This is called the Schwarzschild radius, or the event horizon, because events occurring inside it
cannot propagate light signals to the outside. Any body which is small enough to exist within
its own event horizon is therefore disconnected from the rest of the Universe: its only physical
manifestation is through its (infinitely) deep gravitational potential well, which is what led to the
adoption of the term black hole in the late 1960’s.
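For a body with mass equal to that of our Sun, the event horizon is equal to
$$
r_s = \frac{2GM_\odot}{c^2} = \frac{2 \left(6.67 \times 10^{-8}\right) \left(2 \times 10^{33}\right)}{\left(3 \times 10^{10}\right)^2} \approx 3 \times 10^{5}\ \text{cm} = 3\ \text{km}. \tag{418}
$$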
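The same estimate in code, as a minimal sketch in CGS units. The constant values are standard rounded values and the function name `schwarzschild_radius_cm` is chosen here for illustration.

```python
G = 6.674e-8      # gravitational constant, cm^3 g^-1 s^-2
C = 2.998e10      # speed of light, cm/s
M_SUN = 1.989e33  # solar mass, g

def schwarzschild_radius_cm(mass_g):
    """Schwarzschild radius r_s = 2GM/c^2 in cm, for a mass given in grams."""
    return 2.0 * G * mass_g / C**2

r_s_sun_km = schwarzschild_radius_cm(M_SUN) / 1e5
print(r_s_sun_km)  # roughly 3 km, as in eq. (418)
```

Because $r_s$ is linear in M, a body of ten solar masses has an event horizon of roughly 30 km.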
We can write the proper time in the Schwarzschild metric as
$$
ds^2 = -d\tau^2 \quad \Longrightarrow \quad d\tau^2 = \left(1 - \frac{2GM}{c^2 r}\right) dt^2 - \frac{dr^2}{1 - \frac{2GM}{c^2 r}} - r^2 d\theta^2 - r^2 \sin^2\theta\, d\phi^2, \tag{419}
$$
where dt is the time interval according to an observer at r → ∞, and dτ is the time interval
measured by a local observer (in comoving coordinates, in which the Universe is static). Because
for the local observer the Universe is static, $dr = d\theta = d\phi = 0$, so
$$
dt^2 = \frac{d\tau^2}{1 - \frac{2GM}{c^2 r}}. \tag{420}
$$
This is time dilation: while the local observer near the black hole (at $r \gtrsim r_s$) sees nothing unusual
about her/his time-measurements (dτ ), the measurements of the observer at r → ∞ would suggest
that the local observer’s clock runs slow by a factor $\left(1 - \frac{2GM}{c^2 r}\right)^{-1/2}$. It becomes infinitely slow at
the event horizon $r_s$. Therefore, the inertial observer (at infinity) can never witness the infalling
observer reach the event horizon.
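To get a feel for the size of the effect, the dilation factor $(1 - r_s/r)^{-1/2}$ can be tabulated for a static observer at a few radii. This is a minimal sketch; `dilation_factor` is a hypothetical helper that takes the radius in units of $r_s$.

```python
import math

def dilation_factor(r_over_rs):
    """dt/dtau = (1 - r_s/r)^(-1/2) for a static observer at r > r_s."""
    if r_over_rs <= 1.0:
        raise ValueError("static observers exist only outside the horizon (r > r_s)")
    return 1.0 / math.sqrt(1.0 - 1.0 / r_over_rs)

for x in (100.0, 10.0, 2.0, 1.1, 1.001):
    print(x, dilation_factor(x))
```

The factor stays close to 1 far from the hole and grows without bound as r approaches $r_s$, which is the statement made in the text.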
Orbits in Schwarzschild’s Geometry
For the dynamics of black holes and their accretion disks, it is important to quantify the motion
of particles which find themselves near the black hole. We now present a brief exposition of the
orbit theory near a black hole.
In order to compute orbits in Schwarzschild’s geometry, we need to first compute the equations
of motion.
Combining components of the solutions to Einstein’s equation in Schwarzschild’s metric which
we just derived with the general property of massive particles in a metric
$$
g_{\alpha\beta} \frac{dx^\alpha}{ds} \frac{dx^\beta}{ds} = 1, \tag{421}
$$
(= 0 for photons), it can be shown that the motion near the black hole can be described with
$$
\left(\frac{dr}{d\tau}\right)^2 = \bar{B}^2 - \left(1 + \frac{\bar{A}^2}{r^2}\right) \left(1 - \frac{r_s}{r}\right), \qquad \frac{d\phi}{d\tau} = \frac{\bar{A}}{r^2}, \tag{422}
$$
where $\bar{A}$ is the angular momentum per unit mass and $\bar{B}$ is the energy per unit mass relative to
infinity.
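We now define a relativistic potential
$$
V(r) \equiv \left(1 + \frac{\bar{A}^2}{r^2}\right) \left(1 - \frac{r_s}{r}\right) \tag{423}
$$
so that
$$
\left(\frac{dr}{d\tau}\right)^2 = \bar{B}^2 - V(r). \tag{424}
$$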
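The shape of the potential is given in Fig. 39.
Figure 39: Relativistic potential V (r).
The two extrema of the potential, $(r/r_s)_\pm$, are found by solving:
$$
\begin{aligned}
\frac{dV}{dr} = \frac{r_s}{r^2} - \frac{2\bar{A}^2}{r^3} + \frac{3 r_s \bar{A}^2}{r^4} &= 0 \\
\Longrightarrow \quad \left(\frac{r}{r_s}\right)^2 - 2 \left(\frac{\bar{A}}{r_s}\right)^2 \frac{r}{r_s} + 3 \left(\frac{\bar{A}}{r_s}\right)^2 &= 0 \\
\Longrightarrow \quad \left(\frac{r}{r_s}\right)_\pm &= \left(\frac{\bar{A}}{r_s}\right)^2 \left[1 \pm \sqrt{1 - 3 \left(\frac{r_s}{\bar{A}}\right)^2}\,\right],
\end{aligned} \tag{425}
$$
so there are no circular orbits if $\bar{A}/r_s < \sqrt{3}$.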
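The condition for circular orbits in eq. (425) is easy to explore numerically. The sketch below is an illustration with c = 1; the name `circular_orbit_radii` and the argument `a` (standing for $\bar{A}/r_s$) are chosen here for the example.

```python
import math

def circular_orbit_radii(a):
    """Radii r/r_s of the extrema of V(r) for a = A/r_s, or None if no circular orbit exists."""
    disc = 1.0 - 3.0 / a**2
    if disc < 0.0:
        return None  # a < sqrt(3): the potential has no extrema
    root = math.sqrt(disc)
    # inner root: maximum of V (unstable orbit); outer root: minimum of V (stable orbit)
    return a**2 * (1.0 - root), a**2 * (1.0 + root)

for a in (1.0, 2.0, 4.0):
    print(a, circular_orbit_radii(a))
```

As $\bar{A}/r_s$ decreases toward $\sqrt{3}$, the two roots merge at $r = 3 r_s$, the innermost radius at which a stable circular orbit is possible.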
Lecture 22: Degeneracy of Matter
“Physics is very muddled again at the moment; it is much too hard for me anyway, and I wish I
were a movie comedian or something like that and had never heard anything about physics!”
Wolfgang Pauli
The Big Picture: Last time we derived the Schwarzschild metric corresponding to an isolated
mass, which led to the introduction of black holes and event horizons. Today we introduce
degenerate matter, such as the matter in white dwarfs and neutron stars. We also introduce
polytropes as simple equilibrium stellar models.
Degeneracy
According to Pauli’s Exclusion Principle, no two identical fermions (particles with half-integer spin,
such as electrons and neutrons) can occupy the same quantum state. This is equivalent to requiring that
the volume per fermion be proportional to $\lambda_c^3 \sim (\hbar/mc)^3$, where m is the fermion’s mass and $\lambda_c$ is its Compton wavelength.
The average number density of the fermions is therefore $n_f \sim \lambda_c^{-3}$. In white dwarfs the density is
$n_f$ times the mass per electron, and in neutron stars it is the nucleon mass times $n_f$.
We can use this argument to compute the relative densities of white dwarfs, which are supported
by electron degeneracy, and neutron stars, supported by neutron degeneracy, to obtain (with the approximation that the mass per electron is on the order of magnitude of the mass of the nucleon):
$$
\frac{\rho_{ns}}{\rho_{wd}} = \frac{\lambda_{c,n}^{-3}}{\lambda_{c,e}^{-3}} = \frac{m_n^3}{m_e^3} = \left(\frac{m_n}{m_e}\right)^3 \approx (2000)^3 = 8 \times 10^{9}. \tag{426}
$$
In a gas of very high fermion density, the lower momentum states are filled, so fermions must
then occupy states of higher momentum. These high-momentum fermions make a large contribution
to the pressure, and the gas is said to be (partially) “degenerate”.
Complete Degeneracy
If the fermion density is large enough, then essentially all available states having energies $E < \epsilon_f$
are occupied (where $\epsilon_f$ is the Fermi energy, defined as the energy of the highest occupied quantum state
in a system of fermions at absolute zero temperature). As the gas temperature is lowered, the
distribution function
$$
f(p) = \frac{1}{e^{[E(p) - \mu]/T} + 1}, \tag{427}
$$
approaches unity for particle energies $E \lesssim \mu$, and zero for $E \gtrsim \mu$, where µ is the chemical potential.
For T = 0, $\mu \equiv \epsilon_f$, so the distribution function becomes a step function:
$$
f(p) = \theta(\epsilon_f - E(p)) =
\begin{cases}
1 & \text{if } \epsilon_f \geq E(p), \\
0 & \text{if } \epsilon_f < E(p).
\end{cases} \tag{428}
$$
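The number density of fermions corresponding to the distribution function above is
$$
n = g \int_0^\infty f(p) \frac{d^3 p}{(2\pi\hbar)^3} = 2 \int_0^{p_f} \frac{4\pi p^2\, dp}{(2\pi\hbar)^3} = \frac{8\pi}{h^3} \int_0^{p_f} p^2\, dp = \frac{8\pi}{h^3} \frac{1}{3} p_f^3 = \frac{8\pi}{3h^3} p_f^3, \tag{429}
$$
so
$$
p_f = \left(\frac{3h^3}{8\pi} n\right)^{1/3}. \tag{430}
$$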
$p_f$ is the Fermi momentum corresponding to the Fermi energy:
$$
\epsilon_f = \frac{p_f^2}{2m} \quad \Longrightarrow \quad p_f = \sqrt{2m\epsilon_f}. \tag{431}
$$
The energy density is given by
$$
\rho_e = g \int_0^\infty E(p) f(p) \frac{d^3 p}{h^3} = \frac{8\pi}{h^3} \int_0^{p_f} E(p)\, p^2\, dp, \tag{432}
$$
where E(p) is the kinetic energy per fermion.
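Nonrelativistic (complete) degeneracy. When the fermions are nonrelativistic, $p = mv$
and $E(p) = p^2/2m$. The energy density then is
$$
\begin{aligned}
\rho &= \frac{8\pi}{h^3} \int_0^{p_f} \frac{p^2}{2m} p^2\, dp = \frac{8\pi}{2mh^3} \int_0^{p_f} p^4\, dp = \frac{8\pi}{2mh^3} \frac{1}{5} p_f^5 = \frac{8\pi}{5h^3} p_f^3 \frac{p_f^2}{2m} = \frac{8\pi}{5h^3} \left(\frac{3h^3}{8\pi} n\right) \epsilon_f \\
\Longrightarrow \quad \rho &= \frac{3}{5} n \epsilon_f.
\end{aligned} \tag{433}
$$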
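For nonrelativistic particles
$$
P = \frac{2}{3} \rho = \frac{2}{3} \frac{3}{5} n \epsilon_f = \frac{2}{5} n \frac{p_f^2}{2m} = \frac{n}{5m} p_f^2 = \frac{n}{5m} \left(\frac{3h^3}{8\pi} n\right)^{2/3} = \frac{h^2}{20m} \left(\frac{3}{\pi}\right)^{2/3} n^{5/3}. \tag{434}
$$
The equation above is the equation of state for a nonrelativistic, completely degenerate fermion
gas.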
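Extreme relativistic (complete) degeneracy. When the fermions are relativistic, $p \gg mc$ and
$E(p) = c\sqrt{p^2 + m^2 c^2} \approx cp$. The energy density then is
$$
\begin{aligned}
\rho &= \frac{8\pi}{h^3} \int_0^{p_f} cp\, p^2\, dp = \frac{8\pi c}{h^3} \int_0^{p_f} p^3\, dp = \frac{8\pi c}{h^3} \frac{1}{4} p_f^4 = \frac{2\pi}{h^3} p_f^3 (c p_f) = \frac{2\pi}{h^3} \left(\frac{3h^3}{8\pi} n\right) \epsilon_f \\
\Longrightarrow \quad \rho &= \frac{3}{4} n \epsilon_f.
\end{aligned} \tag{435}
$$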
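For relativistic particles (recall Homework set #1):
$$
P = \frac{1}{3} \rho = \frac{1}{3} \frac{3}{4} n \epsilon_f = \frac{1}{4} n c p_f = \frac{1}{4} n c \left(\frac{3h^3}{8\pi} n\right)^{1/3} = \frac{hc}{8} \left(\frac{3}{\pi}\right)^{1/3} n^{4/3}. \tag{436}
$$
The equation above is the equation of state for an extreme relativistic, completely degenerate
fermion gas.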
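The two equations of state can be written as short functions and cross-checked against the intermediate forms $P = n p_f^2/(5m)$ and $P = n c p_f/4$. This is a minimal sketch in CGS units; the function names are illustrative, the default fermion is taken to be the electron, and the constants are standard rounded values.

```python
import math

H = 6.626e-27    # Planck constant, erg s
C = 2.998e10     # speed of light, cm/s
M_E = 9.109e-28  # electron mass, g

def fermi_momentum(n):
    """Fermi momentum p_f = (3 h^3 n / (8 pi))^(1/3), eq. (430); n in cm^-3."""
    return (3.0 * H**3 * n / (8.0 * math.pi)) ** (1.0 / 3.0)

def pressure_nonrel(n, m=M_E):
    """Nonrelativistic, completely degenerate pressure, eq. (434)."""
    return H**2 / (20.0 * m) * (3.0 / math.pi) ** (2.0 / 3.0) * n ** (5.0 / 3.0)

def pressure_rel(n):
    """Extreme relativistic, completely degenerate pressure, eq. (436)."""
    return H * C / 8.0 * (3.0 / math.pi) ** (1.0 / 3.0) * n ** (4.0 / 3.0)

n = 1e30  # electrons per cm^3 (illustrative value)
print(pressure_nonrel(n), n * fermi_momentum(n) ** 2 / (5.0 * M_E))
print(pressure_rel(n), n * C * fermi_momentum(n) / 4.0)
```

Neither function takes a temperature argument, which reflects the point that the pressure of a completely degenerate gas is independent of T.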
Important point: For complete or nearly complete degeneracy, the pressure P is independent of
the temperature T .
Onset of Degeneracy
We now estimate the thresholds for the onset of the complete nonrelativistic degeneracy and
complete relativistic degeneracy.
• From nondegeneracy to complete nonrelativistic degeneracy.
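Let us first see under which conditions a star will end up in complete nonrelativistic degeneracy. This will happen when the thermal (ideal gas) pressure of the particles is
balanced by the pressure due to the nonrelativistic degeneracy of electrons.
Combining the equation of state for the ideal gas
$$
P = \frac{\rho k T}{\bar{\mu} m_H} \tag{437}
$$
and eq. (434), we obtain
$$
\frac{\rho k T}{\bar{\mu} m_H} = \frac{h^2}{20 m_e} \left(\frac{3}{\pi}\right)^{2/3} n_e^{5/3}, \tag{438}
$$
where $\bar{\mu}$ is the mean molecular weight, defined as
$$
\frac{1}{\bar{\mu}} = \sum_i \frac{\bar{n}_i m_H}{m_i}, \tag{439}
$$
$m_H$ is the mass of the hydrogen atom, and $\bar{n}_i = \rho_i/\rho$ is the abundance of species i by weight.
The number density $n_e$ of electrons is given in terms of the density as
$$
n_e = \frac{\rho}{m_H \bar{\mu}_e}. \tag{440}
$$
Taking $\bar{\mu} = \bar{\mu}_e \approx 1$ (and dropping the factor $(3/\pi)^{2/3} \approx 1$), eq. (438) becomes
$$
\begin{aligned}
\frac{\rho k T}{m_H} &\approx \frac{h^2}{20 m_e} \left(\frac{\rho}{m_H}\right)^{5/3} \\
\Longrightarrow \quad \rho &= m_H \left(\frac{20 m_e k}{h^2}\right)^{3/2} T^{3/2} \\
\rho &= \left(1.67 \times 10^{-24}\ \text{g}\right) \left[\frac{20 \left(9.11 \times 10^{-28}\ \text{g}\right) \left(1.38 \times 10^{-16}\ \text{erg K}^{-1}\right)}{\left(6.63 \times 10^{-27}\ \text{erg s}\right)^2}\right]^{3/2} T^{3/2} \\
\rho &\approx 10^{-8}\, T^{3/2}.
\end{aligned} \tag{441}
$$
Therefore
$$
\rho > 10^{-8}\, T^{3/2} \tag{442}
$$
(with ρ in g cm$^{-3}$ and T in K) is the requirement for the electron gas to be completely degenerate.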
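• From nonrelativistic degeneracy to extreme relativistic degeneracy.
In the case of relativistic particles $p_f \gg m_e c$, but the “transition” occurs at, say, $p_f = 2 m_e c$:
$$
\begin{aligned}
p_f &= \left(\frac{3h^3}{8\pi} n\right)^{1/3} = \left(\frac{3h^3}{8\pi} \frac{\rho}{m_H \bar{\mu}_e}\right)^{1/3} = 2 m_e c \qquad (\text{take } \bar{\mu}_e = 1) \\
\Longrightarrow \quad \rho &\approx \frac{64\pi m_H (m_e c)^3}{3h^3} = \frac{64\pi \left(1.67 \times 10^{-24}\ \text{g}\right) \left[\left(9.11 \times 10^{-28}\ \text{g}\right) \left(3 \times 10^{10}\ \text{cm s}^{-1}\right)\right]^3}{3 \left(6.63 \times 10^{-27}\ \text{erg s}\right)^3} \\
\Longrightarrow \quad \rho &\approx 10^{7}\ \frac{\text{g}}{\text{cm}^3}.
\end{aligned} \tag{443}
$$
Therefore
$$
\rho > 10^{7}\ \frac{\text{g}}{\text{cm}^3} \tag{444}
$$
is the requirement for the gas of electrons to reach extreme relativistic degeneracy.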
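Both order-of-magnitude thresholds can be reproduced with a few lines of code. The sketch below uses the same rounded CGS constants as the text and the same simplifications ($\bar{\mu} = \bar{\mu}_e = 1$, factor $(3/\pi)^{2/3}$ dropped); the function names and the parameter `x` (the ratio $p_f/m_e c$ at the transition) are assumptions made for the example.

```python
import math

M_H = 1.67e-24  # hydrogen mass, g
M_E = 9.11e-28  # electron mass, g
K_B = 1.38e-16  # Boltzmann constant, erg/K
H = 6.63e-27    # Planck constant, erg s
C = 3.0e10      # speed of light, cm/s

def nonrel_degeneracy_coeff():
    """Coefficient A in rho = A * T^(3/2), with rho in g/cm^3 and T in K."""
    return M_H * (20.0 * M_E * K_B / H**2) ** 1.5

def relativistic_threshold_density(x=2.0):
    """Density (g/cm^3) at which p_f = x * m_e * c, taking mu_e = 1."""
    return 8.0 * math.pi * M_H * (x * M_E * C) ** 3 / (3.0 * H**3)

print(nonrel_degeneracy_coeff())         # of order 1e-8
print(relativistic_threshold_density())  # of order 1e7 g/cm^3
```

The second threshold scales as $x^3$, so the quoted value depends noticeably on where one places the transition.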
These degenerate forms of matter describe brown dwarfs, white dwarfs (electron degeneracy)
and neutron stars (neutron degeneracy), which we discussed in Lecture 11.
Figure 40: Simple model of a star: a sphere of gas in hydrostatic equilibrium.
Hydrostatic Equilibrium
We now present a simple model for a star in hydrostatic equilibrium.
Consider a thin shell within a star in equilibrium. Two forces act on the shell: the inward force
due to its gravitating mass and the outward force of gas pressure:
$$
F_g = -G \frac{M(r)\, \rho(r)\, 4\pi r^2\, dr}{r^2}, \qquad F_p = 4\pi r^2 \left[P(r + dr) - P(r)\right] = 4\pi r^2\, dP, \tag{445}
$$
where M (r) is the mass interior to the shell:
$$
M(r) = 4\pi \int_0^r \rho(\tilde{r})\, \tilde{r}^2\, d\tilde{r}. \tag{446}
$$
In hydrostatic equilibrium, these two forces are balanced, so
$$
\begin{aligned}
F_p &= F_g \\
4\pi r^2\, dP &= -G \frac{M(r)\, \rho(r)\, 4\pi r^2\, dr}{r^2} \\
\Longrightarrow \quad \frac{dP}{dr} &= -\rho(r) \frac{GM(r)}{r^2}.
\end{aligned} \tag{447}
$$
The equation above is the equation of hydrostatic equilibrium.
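As an illustration of how the equation of hydrostatic equilibrium is used, the sketch below integrates it inward from the surface for the simplest possible case, a sphere of constant density (an assumption made here only because it has the closed-form answer $P_c = \frac{2\pi}{3} G \rho^2 R^2$ to compare against). The function name and the Sun-like input values are illustrative.

```python
import math

G = 6.674e-8  # gravitational constant, cm^3 g^-1 s^-2

def central_pressure_uniform(rho, R, steps=10000):
    """Integrate dP/dr = -rho G M(r) / r^2 from P(R) = 0 down to r = 0 for constant density."""
    dr = R / steps
    P = 0.0
    for i in range(steps):
        r = R - (i + 0.5) * dr                  # midpoint of the current shell
        M = 4.0 / 3.0 * math.pi * rho * r**3    # mass interior to r
        P += rho * G * M / r**2 * dr            # pressure increases toward the center
    return P

rho, R = 1.41, 6.96e10  # mean solar density (g/cm^3) and solar radius (cm), illustrative
numeric = central_pressure_uniform(rho, R)
analytic = 2.0 * math.pi / 3.0 * G * rho**2 * R**2
print(numeric, analytic)
```

A real star is centrally concentrated, so this constant-density estimate is only a lower bound on the scale of the central pressure.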
Isothermal Atmospheres in Hydrostatic Equilibrium
Stellar atmospheres are usually thin when compared to the stellar radius, which allows us to
approximate the acceleration due to gravity as a constant throughout the atmosphere:
$$
g \equiv \frac{GM}{R^2} \approx \text{const.} \tag{448}
$$
Let h be the height in the atmosphere (the r-derivative can be replaced with an h-derivative). The
equation of hydrostatic equilibrium [eq. (447)] then becomes
$$
\frac{dP}{dh} = -\rho g. \tag{449}
$$
But from the equation of state for ideal gas [eq. (437)]:
$$
P = \frac{\rho k T}{\bar{\mu} m_H} \quad \Longrightarrow \quad \rho = \frac{\bar{\mu} m_H}{kT} P, \tag{450}
$$
so eq. (449) becomes
$$
\frac{dP}{dh} = -\frac{\bar{\mu} m_H g}{kT} P. \tag{451}
$$
If we define the “e-folding height” (“scale height”) of the atmosphere as
$$
H \equiv \frac{kT}{\bar{\mu} m_H g}, \tag{452}
$$
and define the initial condition $P(0) = P_0$, we can rewrite eq. (451) and integrate it to obtain
$$
\begin{aligned}
\frac{dP}{dh} = -\frac{P}{H} \quad &\Longrightarrow \quad \frac{dP}{P} = -\frac{dh}{H} \\
\log P = -\frac{h}{H} + c \quad &\Longrightarrow \quad P(h) = C e^{-h/H} \\
\text{but } P(0) = P_0 \quad &\Longrightarrow \quad P(h) = P_0 e^{-h/H}.
\end{aligned} \tag{453}
$$
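A short numerical example of the scale height and the exponential pressure profile follows. It is a sketch only: the temperature and surface gravity are rough solar-photosphere values chosen for illustration, $\bar{\mu} = 1$ is assumed, and the function names are hypothetical.

```python
import math

K_B = 1.381e-16  # Boltzmann constant, erg/K
M_H = 1.673e-24  # hydrogen mass, g

def scale_height(T, g, mu=1.0):
    """Scale height H = kT / (mu m_H g) in cm, eq. (452)."""
    return K_B * T / (mu * M_H * g)

def pressure(h, P0, H):
    """Isothermal pressure profile P(h) = P0 exp(-h/H), eq. (453)."""
    return P0 * math.exp(-h / H)

# Illustrative inputs (assumptions): T = 5800 K, g = 2.74e4 cm/s^2.
H_sun = scale_height(5800.0, 2.74e4)
print(H_sun / 1e5)                  # scale height in km
print(pressure(H_sun, 1.0, H_sun))  # pressure drops by a factor e over one scale height
```

The resulting scale height of order a hundred kilometers is tiny compared with the solar radius, which is consistent with the thin-atmosphere assumption behind eq. (448).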
Important point: the equation of hydrostatic equilibrium must be accompanied by an equation
of state.
Polytropes
Polytropes are a family of equations of state for which the pressure P is given as a power of
density ρ. A gas governed by a polytropic process has the equation of state
$$
P V^{\gamma} = \text{const.} \tag{454}
$$
Since ρ = M/V , where M is the mass of gas contained in volume V , we have
$$
P \propto V^{-\gamma} \propto \left(\frac{M}{\rho}\right)^{-\gamma} \quad \Longrightarrow \quad P = \kappa \rho^{\gamma}, \qquad \kappa = \text{const.} \tag{455}
$$
Gas obeying an equation of state of this form is called a polytrope. Examples of polytropes are
given in Table 7.
Table 7: Examples of polytropic gases.
Type of polytropic gas                             γ
nonrelativistic, completely degenerate gas         5/3
extreme relativistic, completely degenerate gas    4/3
isothermal gas                                     1
gas and radiation pressure                         4/3
Eddington standard model. The polytrope with γ = 4/3 is a simple model of a star supported
by both radiation pressure
$$
P_r = \frac{1}{3} \rho_\gamma = \frac{1}{3} \frac{\pi^2}{15} T^4 = \frac{\pi^2}{45} T^4 \equiv \frac{1}{3} a T^4, \tag{456}
$$
and ideal gas pressure:
$$
P_g = \frac{\rho k T}{\bar{\mu} m_H}. \tag{457}
$$
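Now introduce the constant β quantifying the relative contribution of gas pressure to the total
pressure (both gas and radiation) ($P = P_r + P_g$):
$$
P_g = \beta P \quad \Longrightarrow \quad \beta = \frac{P_g}{P} \quad \Longrightarrow \quad P_r = (1 - \beta) P, \tag{458}
$$
so that
$$
P_r = (1 - \beta) P = \frac{1}{3} a T^4 \quad \Longrightarrow \quad T^4 = \frac{3(1 - \beta)}{a} P. \tag{459}
$$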
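Next, we eliminate the temperature T from the equation of state:
$$
\begin{aligned}
P_g^4 = \beta^4 P^4 &= \left(\frac{\rho k}{\bar{\mu} m_H}\right)^4 T^4 = \left(\frac{\rho k}{\bar{\mu} m_H}\right)^4 \frac{3(1 - \beta)}{a} P \\
\Longrightarrow \quad P^3 &= \left(\frac{k}{\bar{\mu} m_H}\right)^4 \frac{3(1 - \beta)}{a \beta^4} \rho^4 \\
\Longrightarrow \quad P &= \left(\frac{k}{\bar{\mu} m_H}\right)^{4/3} \left[\frac{3(1 - \beta)}{a \beta^4}\right]^{1/3} \rho^{4/3}.
\end{aligned} \tag{460}
$$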
The term multiplying $\rho^{4/3}$ in the equation above is constant if β is constant (the relative breakdown
of radiation and gas pressure remains unchanged) and $\bar{\mu}$ is constant (composition of gas does not
change). If this is indeed the case, then we have the Eddington standard model
$$
P = \kappa \rho^{4/3}, \qquad \kappa \equiv \left(\frac{k}{\bar{\mu} m_H}\right)^{4/3} \left[\frac{3(1 - \beta)}{a \beta^4}\right]^{1/3}. \tag{461}
$$
This model is a special case of Lane-Emden equations governing the polytropes in hydrostatic
equilibrium which we will discuss next time.
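The algebra leading to κ can be checked for self-consistency: for a given β, the pressure from $P = \kappa \rho^{4/3}$ should split into a gas part equal to βP. The sketch below does this in CGS units, where the radiation constant is taken as $a \approx 7.566 \times 10^{-15}$ erg cm$^{-3}$ K$^{-4}$ (the text writes a in natural units); the function names and the input values are assumptions made for the example.

```python
K_B = 1.381e-16    # Boltzmann constant, erg/K
M_H = 1.673e-24    # hydrogen mass, g
A_RAD = 7.566e-15  # radiation constant, erg cm^-3 K^-4

def eddington_kappa(beta, mu):
    """Polytropic constant of the Eddington standard model, eq. (461)."""
    return (K_B / (mu * M_H)) ** (4.0 / 3.0) * (3.0 * (1.0 - beta) / (A_RAD * beta**4)) ** (1.0 / 3.0)

def gas_fraction(beta, mu, rho):
    """Recompute P_gas / P from P = kappa rho^(4/3); should return beta."""
    P = eddington_kappa(beta, mu) * rho ** (4.0 / 3.0)
    T = (3.0 * (1.0 - beta) * P / A_RAD) ** 0.25  # from P_r = (1 - beta) P = a T^4 / 3
    P_gas = rho * K_B * T / (mu * M_H)
    return P_gas / P

print(gas_fraction(0.9, 0.6, 100.0))  # should reproduce beta = 0.9
```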
Lecture 23: The Lane-Emden Equation
“Science is facts; just as houses are made of stones, so is science made of facts; but a pile of stones
is not a house and a collection of facts is not necessarily science.”
Henri Poincaré
The Big Picture: Today we discuss the Lane-Emden equation, which describes polytropes in
hydrostatic equilibrium as simple models of a star. We also derive the Chandrasekhar mass limit,
above which a stellar remnant cannot be supported by electron degeneracy pressure.
The Lane-Emden Equation
Last time we introduced the polytropes as a family of equations of state for gas in hydrostatic
equilibrium. They are given by the equation of state in which the pressure is given as a power-law
in density:
$$ P = \kappa\rho^{\gamma}, \tag{462} $$
where κ and γ are constants. The Lane-Emden equation combines the above equation of state for
polytropes and the equation of hydrostatic equilibrium
$$ \frac{dP}{dr} = -\rho(r)\frac{GM(r)}{r^2}. \tag{463} $$
If we solve the equation above for M(r)
$$ M(r) = -\frac{r^2}{\rho G}\frac{dP}{dr} \implies \frac{dM}{dr} = -\frac{1}{G}\frac{d}{dr}\left(\frac{r^2}{\rho}\frac{dP}{dr}\right), \tag{464} $$
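and compare it to what we obtain from considering the spherical shell in hydrostatic equilibrium
$$ dM = 4\pi r^2\rho\,dr \implies \frac{dM}{dr} = 4\pi r^2\rho, \tag{465} $$
we obtain
$$ \frac{dM}{dr} = -\frac{1}{G}\frac{d}{dr}\left(\frac{r^2}{\rho}\frac{dP}{dr}\right) = 4\pi r^2\rho \implies \frac{1}{r^2}\frac{d}{dr}\left(\frac{r^2}{\rho}\frac{dP}{dr}\right) = -4\pi G\rho. \tag{466} $$
After inserting the polytropic equation of state [eq. (462)], the equation above becomes
$$ \frac{1}{r^2}\frac{d}{dr}\left(\frac{r^2}{\rho}\kappa\gamma\rho^{\gamma-1}\frac{d\rho}{dr}\right) = -4\pi G\rho. \tag{467} $$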
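After defining quantities
$$ \rho \equiv \lambda\theta^n, \qquad \gamma \equiv \frac{n+1}{n}, \tag{468} $$
the eq. (467) becomes
$$ \frac{1}{r^2}\frac{d}{dr}\left[\frac{\kappa r^2}{\lambda\theta^n}\frac{n+1}{n}(\lambda\theta^n)^{1/n}\frac{d(\lambda\theta^n)}{dr}\right] = -4\pi G\lambda\theta^n \implies \frac{n+1}{4\pi G}\kappa\lambda^{\frac{1-n}{n}}\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{d\theta}{dr}\right) = -\theta^n. \tag{469} $$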
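We now make this equation dimensionless by introducing a radial variable ξ
$$ \xi \equiv \frac{r}{\alpha}, \qquad \alpha \equiv \sqrt{\frac{n+1}{4\pi G}\kappa\lambda^{\frac{1-n}{n}}}, \tag{470} $$
to finally obtain the Lane-Emden equation for polytropes in hydrostatic equilibrium:
$$ \alpha^2\frac{1}{(\alpha\xi)^2}\frac{d}{d(\alpha\xi)}\left[(\alpha\xi)^2\frac{d\theta}{d(\alpha\xi)}\right] = -\theta^n \implies \frac{1}{\xi^2}\frac{d}{d\xi}\left(\xi^2\frac{d\theta}{d\xi}\right) = -\theta^n. \tag{471} $$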
This is a second order ordinary differential equation, which means that it requires two boundary
conditions in order to be well-defined:
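1. Define the central density ρ_c ≡ λ. Then
$$ \rho = \lambda\theta^n \implies \theta(0) = 1. \tag{472} $$
2. At r = 0, dP/dr = −ρg = −ρ_c g_c = 0, because g_c = 0 (there is no mass inside zero radius).
Therefore,
$$ \frac{dP}{dr} = \kappa\gamma\rho^{\gamma-1}\frac{d\rho}{dr} \propto \frac{d\theta}{d\xi} \implies \left.\frac{d\theta}{d\xi}\right|_{\xi=0} = 0. \tag{473} $$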
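The two boundary conditions fix the behaviour of θ near the centre. As a short worked step (not part of the original derivation), assume an even power series θ = 1 + a ξ² + b ξ⁴ + ..., insert it into eq. (471), and match powers of ξ: the left-hand side is 6a + 20b ξ² + ..., while the right-hand side is −θ^n = −1 − n a ξ² − ..., so a = −1/6 and b = n/120:

$$ \theta(\xi) = 1 - \frac{\xi^2}{6} + \frac{n}{120}\xi^4 + O(\xi^6), \qquad \frac{d\theta}{d\xi} = -\frac{\xi}{3} + \frac{n}{30}\xi^3 + O(\xi^5). $$

This expansion is also a convenient way to start a numerical integration slightly away from ξ = 0, where the 2/ξ term in the expanded form of eq. (471) is singular.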
Analytic Solutions of the Lane-Emden Equation
The Lane-Emden equation can be analytically solved only for a few special, integer values of
the index n: 0, 1 and 5. For all other values of n, we must resort to numerical solutions. However,
it is beneficial from both pedagogical and intuitive standpoint to derive these analytical solutions,
which is what we do next.
Analytic solution for n=0.
After substituting n = 0 into the Lane-Emden equation [eq. (471)], we obtain
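$$ \frac{1}{\xi^2}\frac{d}{d\xi}\left(\xi^2\frac{d\theta}{d\xi}\right) = -1 \implies \int\frac{d}{d\xi}\left(\xi^2\frac{d\theta}{d\xi}\right)d\xi = -\int\xi^2d\xi \implies \xi^2\frac{d\theta}{d\xi} = -\frac{1}{3}\xi^3 + c_1 \implies \frac{d\theta}{d\xi} = -\frac{1}{3}\xi + \frac{c_1}{\xi^2}. \tag{474} $$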
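But, using the boundary conditions, we obtain
$$ \left.\frac{d\theta}{d\xi}\right|_{\xi=0} = 0 \implies c_1 = 0 \implies \frac{d\theta}{d\xi} = -\frac{1}{3}\xi \implies \theta = -\frac{1}{6}\xi^2 + c_2, $$
$$ \theta(0) = 1 \implies c_2 = 1 \implies \theta_0 = 1 - \frac{1}{6}\xi^2. \tag{475} $$
From the equation above, we see that this configuration has a boundary at ξ = √6, where θ_0 → 0.
Analytic solution for n=1.
After substituting n = 1 into the Lane-Emden equation [eq. (471)], we obtain
$$ \frac{1}{\xi^2}\frac{d}{d\xi}\left(\xi^2\frac{d\theta}{d\xi}\right) = -\theta \implies \frac{d}{d\xi}\left(\xi^2\frac{d\theta}{d\xi}\right) = -\xi^2\theta. \tag{476} $$
Introduce the variable χ
$$ \chi(\xi) \equiv \xi\theta(\xi) \implies \theta \equiv \frac{\chi}{\xi}. \tag{477} $$
Then
$$ \frac{d\theta}{d\xi} = \frac{d}{d\xi}\left(\frac{\chi}{\xi}\right) = \frac{\xi\chi' - \chi}{\xi^2}, \tag{478} $$
and the Lane-Emden equation in eq. (476) becomes
$$ \frac{d}{d\xi}\left(\xi^2\frac{d\theta}{d\xi}\right) = \frac{d}{d\xi}\left(\xi\chi' - \chi\right) = \chi' + \xi\chi'' - \chi' = \xi\chi'' \implies \xi\chi'' = -\xi^2\frac{\chi}{\xi} \implies \chi'' = -\chi \implies \chi'' + \chi = 0. \tag{479} $$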
This is a harmonic oscillator with general solutions
$$ \chi(\xi) = A\sin\xi + B\cos\xi, \tag{480} $$
or, in terms of θ ≡ χ/ξ
$$ \theta(\xi) = A\frac{\sin\xi}{\xi} + B\frac{\cos\xi}{\xi}. \tag{481} $$
After imposing the first boundary condition, the particular solution is obtained:
$$ \theta(0) = 1 \implies B = 0 \ \text{because} \ \lim_{\xi\to0}\frac{\cos\xi}{\xi} = \infty, \qquad A = 1 \ \text{because} \ \lim_{\xi\to0}\frac{\sin\xi}{\xi} = 1, \implies \theta_1(\xi) = \frac{\sin\xi}{\xi}. \tag{482} $$
The second boundary condition dθ/dξ = 0 at ξ = 0 is explicitly satisfied, because, after applying L'Hospital's
rule
$$ \lim_{\xi\to0}\frac{\xi\cos\xi - \sin\xi}{\xi^2} = \lim_{\xi\to0}\frac{-\xi\sin\xi + \cos\xi - \cos\xi}{2\xi} = -\frac{1}{2}\lim_{\xi\to0}\sin\xi = 0, \tag{483} $$
as required. From the eq. (482) above, we see that this configuration has a boundary at ξ = π,
where θ_1 → 0.
Figure 41: Analytic solutions for the Lane-Emden equation with n = 0, 1, 5 (ρ/λ plotted against r/α; the n = 0 and n = 1 curves reach zero at r/α = 6^{1/2} and π, respectively).
Analytic solution for n=5.
The solution of the Lane-Emden equation with n = 5 is analytically tractable, yet quite complicated
to integrate. The solution is
$$ \theta_5(\xi) = \frac{1}{\sqrt{1 + \frac{1}{3}\xi^2}}. \tag{484} $$
This configuration is unbounded: ξ ∈ [0, ∞), and θ_5 → 0 as ξ → ∞.
[For explicit derivation, see S. Chandrasekhar's An Introduction to the Study of Stellar Structure
(University of Chicago Press, Chicago, 1939), p. 93-94]
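For other values of n the equation has to be integrated numerically, and the analytic cases make a useful test of any such integrator. The following is a minimal sketch, assuming a fixed-step fourth-order Runge-Kutta scheme written for these notes (the function name `lane_emden_surface`, the step size, and the series start near ξ = 0 are choices made here, not part of the lecture). It integrates outward until θ changes sign and returns the surface location ξ_1 together with −ξ_1² θ'(ξ_1).

```python
import math


def lane_emden_surface(n, h=1e-3, xi_max=50.0):
    """Integrate the Lane-Emden equation outward with fixed-step RK4.

    Returns (xi_1, -xi_1**2 * dtheta/dxi at xi_1), where theta(xi_1) = 0,
    or None if theta stays positive up to xi_max.
    """
    def rhs(xi, theta, dtheta):
        # theta'' = -theta^n - (2/xi) theta'; theta is clipped at 0 so theta**n stays real
        return dtheta, -max(theta, 0.0) ** n - 2.0 * dtheta / xi

    # Start slightly off-centre using the series theta = 1 - xi^2/6 + n xi^4/120
    xi = h
    theta = 1.0 - xi**2 / 6.0 + n * xi**4 / 120.0
    dtheta = -xi / 3.0 + n * xi**3 / 30.0

    while xi < xi_max:
        k1t, k1d = rhs(xi, theta, dtheta)
        k2t, k2d = rhs(xi + h / 2, theta + h / 2 * k1t, dtheta + h / 2 * k1d)
        k3t, k3d = rhs(xi + h / 2, theta + h / 2 * k2t, dtheta + h / 2 * k2d)
        k4t, k4d = rhs(xi + h, theta + h * k3t, dtheta + h * k3d)
        new_theta = theta + h / 6 * (k1t + 2 * k2t + 2 * k3t + k4t)
        new_dtheta = dtheta + h / 6 * (k1d + 2 * k2d + 2 * k3d + k4d)
        if new_theta <= 0.0:
            # Linear interpolation for the zero crossing inside this step
            frac = theta / (theta - new_theta)
            xi_1 = xi + frac * h
            d_1 = dtheta + frac * (new_dtheta - dtheta)
            return xi_1, -xi_1**2 * d_1
        xi, theta, dtheta = xi + h, new_theta, new_dtheta
    return None


for n in (0, 1):
    xi_1, mass_factor = lane_emden_surface(n)
    print(n, xi_1, mass_factor)
print(5, lane_emden_surface(5, xi_max=20.0))
```

With these settings the n = 0 and n = 1 surfaces should land close to √6 and π, and the n = 5 run should find no surface within the integration range, in line with eqs. (475), (482) and (484).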
The Chandrasekhar Mass Limit
Consider a star which has, through gravitational contraction, become so dense that it is supported by a completely degenerate, extreme relativistic electron gas (i.e., ρ > 10^7 g cm^{-3}). The
pressure in terms of the density is obtained by combining the eq. (436)
$$ P = \frac{hc}{8}\left(\frac{3}{\pi}\right)^{1/3}n^{4/3} \tag{485} $$
and the electron number density
$$ n = \frac{\rho}{m_H\bar\mu_e} \tag{486} $$
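to obtain
$$ P = \frac{hc}{8}\left(\frac{3}{\pi}\right)^{1/3}\left(\frac{\rho}{m_H\bar\mu_e}\right)^{4/3} = \frac{(6.63\times10^{-27}\ \text{erg s})(3\times10^{10}\ \text{cm s}^{-1})}{8}\left(\frac{3}{\pi}\right)^{1/3}\frac{1}{(1.67\times10^{-24}\ \text{g})^{4/3}}\left(\frac{\rho}{\bar\mu_e}\right)^{4/3} $$
$$ \implies P = 1.24\times10^{15}\left(\frac{\rho}{\bar\mu_e}\right)^{4/3}, \tag{487} $$
which is an equation of state for a polytrope with γ = 4/3 and κ = 1.24×10^{15}/µ̄_e^{4/3} (in cgs units). The corresponding value
of the index n = 1/(γ − 1) is n = 3.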
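The mass corresponding to this polytropic configuration can be computed as follows:
$$ M_3 = \int_0^{r_{max}}\rho(r)\,d^3r = 4\pi\int_0^{r_{max}}\rho(r)r^2dr = 4\pi\int_0^{\xi_{max}}\lambda\theta^3(\alpha\xi)^2d(\alpha\xi) = 4\pi\lambda\alpha^3\int_0^{\xi_{max}}\left[-\frac{d}{d\xi}\left(\xi^2\frac{d\theta}{d\xi}\right)\right]d\xi = 4\pi\lambda\alpha^3\left[-\xi^2\frac{d\theta}{d\xi}\right]_{\xi_{max}}, \tag{488} $$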
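where we have used the Lane-Emden equation in eq. (471). The constant α is defined in eq. (470),
and for n = 3 is
$$ \alpha = \sqrt{\frac{n+1}{4\pi G}\kappa\lambda^{\frac{1-n}{n}}} \implies \alpha = \sqrt{\frac{\kappa}{\pi G}\lambda^{-2/3}} \implies \lambda\alpha^3 = \lambda\left[\frac{\kappa}{\pi G}\lambda^{-2/3}\right]^{3/2} = \left[\frac{\kappa}{\pi G}\right]^{3/2}. \tag{489} $$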
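The bracketed term in eq. (488), −ξ² dθ/dξ evaluated at ξ_max, can be computed numerically (Table 4.2 of Astrophysics I: Stars by Bowers
& Deeming) to be about 2.02, so the total mass is
$$ M_3 = 4\pi\left[\frac{1.24\times10^{15}/\bar\mu_e^{4/3}}{\pi(6.67\times10^{-8})}\right]^{3/2}2.02 = 4\pi\left[\frac{1.24\times10^{15}}{\pi(6.67\times10^{-8})}\right]^{3/2}\frac{2.02}{\bar\mu_e^2} = \frac{1.16\times10^{34}}{\bar\mu_e^2}\ \text{g} = \frac{1.16\times10^{34}}{\bar\mu_e^2}\frac{M_\odot}{1.99\times10^{33}} $$
$$ \implies M_3 = \frac{5.81}{\bar\mu_e^2}M_\odot. \tag{490} $$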
Let us now compute µ̄_e for a star with relativistic matter degeneracy. In such a star, it is convenient
to define the matter density, due essentially to the ions, as ρ = m_H µ̄_e n_e. Also, let us consider
contributions from hydrogen (subscript H), helium (He) and elements with atomic weight greater
than 4 (Z). Then, from the definition in eq. (439), we have
$$ \begin{aligned} \frac{1}{\bar\mu_e} &= \sum_i\frac{m_H}{m_e}\bar n_i^e = \frac{m_H}{m_e}\sum_i\bar n_i^e = \frac{m_H}{m_e}\left(\frac{\rho^e_H}{\rho} + \frac{\rho^e_{He}}{\rho} + \frac{\rho^e_Z}{\rho}\right) \\ &= \frac{m_Hn_H}{\rho} + 2\frac{m_H}{m_{He}}\frac{m_{He}n_{He}}{\rho} + \frac{A}{2}\frac{m_H}{m_Z}\frac{m_Zn_Z}{\rho} = \frac{\rho_H}{\rho} + \frac{2}{4}\frac{\rho_{He}}{\rho} + \frac{A}{2A}\frac{\rho_Z}{\rho} \\ &\equiv X + \frac{1}{2}Y + \frac{1}{2}Z. \end{aligned} \tag{491} $$
Also, conservation of mass imposes that
$$ X + Y + Z = 1 \implies Z = 1 - X - Y, \tag{492} $$
so
$$ \frac{1}{\bar\mu_e} = X + \frac{1}{2}Y + \frac{1}{2}(1 - X - Y) = \frac{1}{2}X + \frac{1}{2} = \frac{1+X}{2} \implies \bar\mu_e = \frac{2}{1+X}. \tag{493} $$
The stars that are undergoing extreme relativistic degeneracy of matter are highly evolved (near
the end of their life-cycle), which means that it is reasonable to assume that most of their hydrogen
fuel has been burned up, so
$$ X \approx 0 \implies \bar\mu_e \approx 2. \tag{494} $$
Finally, we combine this result with the eq. (490) to obtain the Chandrasekhar mass limit:
$$ M_{Ch} = \frac{5.81}{\bar\mu_e^2}M_\odot = \frac{5.81}{2^2}M_\odot \implies M_{Ch} = 1.45\,M_\odot. \tag{495} $$
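The chain of eqs. (487) to (495) can be reproduced numerically. The sketch below is an illustration under stated assumptions: it integrates the n = 3 Lane-Emden equation with a simple fixed-step RK4 scheme (written for this example; the function names are hypothetical) to obtain the factor −ξ² dθ/dξ at the surface, and then uses the rounded cgs constants quoted in the lecture.

```python
import math

G = 6.67e-8        # gravitational constant [cgs], value used in eq. (490)
M_SUN = 1.99e33    # solar mass [g], value used in eq. (490)
KAPPA_0 = 1.24e15  # coefficient of eq. (487) [cgs]; multiplies (rho / mu_e)^(4/3)


def n3_mass_factor(h=1e-3):
    """RK4 integration of the n = 3 Lane-Emden equation; returns -xi^2 theta' at the surface."""
    def accel(xi, theta, dtheta):
        # theta'' = -theta^3 - (2/xi) theta'; theta is clipped at 0 near the surface
        return -max(theta, 0.0) ** 3 - 2.0 * dtheta / xi

    xi = h
    theta = 1.0 - xi**2 / 6.0 + 3.0 * xi**4 / 120.0   # series start near the centre
    dtheta = -xi / 3.0 + 3.0 * xi**3 / 30.0
    while True:
        k1t, k1d = dtheta, accel(xi, theta, dtheta)
        k2t = dtheta + 0.5 * h * k1d
        k2d = accel(xi + 0.5 * h, theta + 0.5 * h * k1t, k2t)
        k3t = dtheta + 0.5 * h * k2d
        k3d = accel(xi + 0.5 * h, theta + 0.5 * h * k2t, k3t)
        k4t = dtheta + h * k3d
        k4d = accel(xi + h, theta + h * k3t, k4t)
        theta_new = theta + h / 6.0 * (k1t + 2 * k2t + 2 * k3t + k4t)
        dtheta_new = dtheta + h / 6.0 * (k1d + 2 * k2d + 2 * k3d + k4d)
        if theta_new <= 0.0:
            # Interpolate linearly to the point where theta crosses zero
            frac = theta / (theta - theta_new)
            xi_s = xi + frac * h
            d_s = dtheta + frac * (dtheta_new - dtheta)
            return -xi_s**2 * d_s
        xi, theta, dtheta = xi + h, theta_new, dtheta_new


def chandrasekhar_mass(mu_e):
    """Mass of the n = 3 polytrope in solar masses, following eqs. (488)-(490)."""
    kappa = KAPPA_0 / mu_e ** (4.0 / 3.0)
    lam_alpha3 = (kappa / (math.pi * G)) ** 1.5   # lambda * alpha^3 from eq. (489)
    return 4.0 * math.pi * lam_alpha3 * n3_mass_factor() / M_SUN


print(n3_mass_factor(), chandrasekhar_mass(2.0))
```

Because λ cancels in eq. (489), the mass depends only on κ and hence only on µ̄_e, which is why the result is a unique limiting mass rather than a mass-radius relation.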
When a star runs out of fuel, it will explode into a supernova or a helium flash (see Fig. 16). The
Chandrasekhar mass limit implies that star remnants with mass M > M_Ch cannot be supported by
electron degeneracy and therefore will collapse further into a neutron star or a black hole.
Lecture 24: Galaxies: Classification and Treatment
“The effort to understand the Universe is one of the very few things that lifts human life a little
above the level of farce, and gives it some of the grace of tragedy.”
Steven Weinberg
The Big Picture: Today we define and classify galaxies and outline their main characteristics.
We also justify the mean-field approximation in galaxy modeling.
The Hubble Classification of Galaxies
Galaxies are found in a wide range of shapes, sizes and masses, but can be divided into four
main types according to Hubble classification (see Fig. 42).
Figure 42: The Hubble classification of galaxies.
Galaxies near the start of the sequence (early-type galaxies) have little or no cool gas and dust,
and consist mostly of old Population II stars (old, less luminous and cooler than Population I stars;
have fewer heavy elements — “metal-poor”); galaxies near the end (late-type galaxies) are rich in
gas, dust, and young stars.
Elliptical Galaxies
Elliptical galaxies are smooth, featureless systems containing little or no gas or dust. The
fraction of bright galaxies that are elliptical is a function of the local density, ranging from about
10% in low-density regions to 40% in dense clusters of galaxies. The isophotes (contours of constant
surface brightness) are approximately concentric ellipses, with axis ratio b/a ranging from 1 to about
0.3. Elliptical galaxies are denoted by the symbols E0, E1, etc., where the brightest isophotes of a
galaxy of type En have axis ratio b/a = 1 − n/10. The ellipticity is ε = 1 − b/a. Thus the most
elongated elliptical galaxies are of type E7. Since we see only the projected brightness distribution,
it is impossible to determine directly whether elliptical galaxies are axisymmetric or triaxial.
Surface brightness profiles.
The surface brightness of an elliptical galaxy falls off smoothly with radius. Often the outermost
parts of a galaxy are undetectable against the background night-sky brightness. The surface-brightness profiles of most elliptical galaxies can be fit reasonably well by the empirically-motivated
R^{1/4} or de Vaucouleurs' law
$$ I(R) = I(0)\,e^{-kR^{1/4}} \equiv I_e\exp\left\{-7.67\left[(R/R_e)^{1/4} - 1\right]\right\}, \tag{496} $$
where the effective radius R_e is the radius of the isophote containing half of the total luminosity and
I_e is the surface brightness at R_e. The effective radius is typically 3/h kpc for bright ellipticals and is
smaller for fainter galaxies.
However, it has been shown that de Vaucouleurs’ R1/4 law is appropriate only for a subset
of elliptical galaxies. Generalizing de Vaucouleurs’ law to allow for a varying rate of exponential
decay, we arrive at the Sérsic law (of which de Vaucouleurs’ is a special case when n = 4):
$$ I(R) = I(0)\,e^{-kR^{1/n}}. \tag{497} $$
It has been shown that there exists a strong correlation between the observed size of the elliptical
galaxy and the best-fit index n: heavier elliptical galaxies have higher values of n.
Central density cusps and supermassive black holes.
With the advent of the Hubble Space Telescope, modeling of elliptical galaxies has undergone a
revolution: elliptical galaxies are not well-approximated by density profiles with central cores, as
once thought, but have logarithmic slopes of the density profiles which increase all the way to
the smallest observable radius: the elliptical galaxies have central density cusps. Furthermore,
the centers of most elliptical galaxies harbor a supermassive black hole, with mass millions (and
sometimes billions) times that of our Sun.
No net rotation.
Most giant elliptical galaxies exhibit little or no rotation, even those with highly elongated isophotes.
Their stars have random velocities along the line of sight whose root mean square dispersion σp can
be measured from the Doppler broadening of spectral lines. The velocity dispersion in the inner
few kiloparsecs is correlated with luminosity according to the Faber-Jackson law
$$ \sigma_p \simeq 220\,(L/L_\star)^{1/4}\ \text{km s}^{-1}. \tag{498} $$
Lenticular Galaxies
Lenticular galaxies have a prominent disk that contains no gas, bright young stars, or spiral
arms. Lenticular disks are smooth and featureless, like elliptical galaxies, but obey the exponential
surface-brightness law characteristic of spiral galaxies:
$$ I(R) = I(0)\,e^{-R/R_d}, \tag{499} $$
where the disc scale length R_d = 3.5 ± 0.5 kpc. Lenticulars are labeled by the notation S0 in
Hubble's classification scheme. They are very rare in low-density regions, comprising less than 10%
of all bright galaxies, but up to half of all galaxies in high-density regions are S0's.
The lenticulars form a transition class between ellipticals and spirals. The transition is smooth
and continuous, so that there are S0 galaxies that might well be classified as E7, and others that
have sometimes been classified as spirals.
The strong dependence of the fractional abundance of S0 galaxies
on the local density is obviously an important (but still controversial) clue to the mechanism
of galaxy formation.
Spiral Galaxies
Spiral galaxies, like the Milky Way, contain a prominent disk composed of gas, dust and Population I stars (Population I stars include the Sun and tend to be luminous, hot and young, concentrated in the disks of spiral galaxies, and particularly found in the spiral arms). In all these systems
the disk contains spiral arms, filaments of bright stars, gas, and dust, in which large numbers of
stars are currently forming. The spiral arms vary greatly in their length and prominence from one
spiral galaxy to another but are almost always present.
In low-density regions of the Universe, almost 80% of all bright galaxies are spirals, but the
fraction drops to 10% in dense regions such as cluster cores.
The distribution of surface brightness in spiral galaxy disks obeys the exponential law. The
typical disk scale length is R_d ≃ 3/h kpc, and the central surface brightness is remarkably constant
at I_0 ≃ 140 L⊙ pc^{-2}.
The circular-speed curves of most spiral galaxies are nearly flat, with v_c(R) independent of R, except
near the center, where the circular speed drops to zero. Typical circular speeds are between 200
and 300 km s^{-1}. It is a remarkable fact that the circular speed curves still remain flat even at radii
well beyond the outer edge of the visible galaxy, thus implying the presence of invisible or dark
mass in the outer parts of the galaxy.
Spiral galaxies also contain a spheroid of Population II stars. The luminosity of the spheroid
relative to the disk correlates well with a number of other properties of the galaxy, in particular
the fraction of the disk mass in gas, the color of the disk, and how tightly the spiral arms are
wound. This correlation is the basis of Hubble’s classification of spiral galaxies. Hubble divided
spiral galaxies into a sequence of four classes or types, called Sa, Sb, Sc, Sd. Along the sequence
Sa → Sd the relative luminosity of the spheroid decreases, the relative mass of gas increases, and
the spiral arms become more loosely wound. The spiral arms also become more clumpy, so that
individual patches of young stars and HII regions (a cloud of glowing gas and plasma, sometimes
several hundred light-years across, in which star formation is taking place) become visible. Our
galaxy appears to be intermediate between Sb and Sc, so its Hubble type is written as Sbc.
Irregular Galaxies
Any classification scheme has to contain an attic – a class into which objects that conform to
no particular pattern can be placed. Since the time of Hubble, nonconformist galaxies have been
dumped into the irregular class (denoted Irr). A minority of Irr galaxies are spiral or elliptical
galaxies that have been violently distorted by a recent encounter with a neighbor. However, the
majority of Irr galaxies are simply low-luminosity gas-rich systems. These galaxies are designated
Sm or Im.
Galaxies as Collisionless Systems
The mean-field approximation is an effective tool for studying the dynamics of many-body
systems when the collisions are rare (i.e., when the collisional time-scales are long compared to
the dynamical time of the system studied). When that is the case, the system is said to be
collisionless, and the collisionless Boltzmann equation can be used. We have already seen the
Boltzmann equation in the context of non-equilibrium reactions, where the RHS of the equation
represented the non-equilibrium term.
Let us first estimate the collisional relaxation rates for a general self-gravitating N-body system.
Then we will particularize the solution to the case of a typical galaxy, and see if a mean field
approximation is indeed warranted.
Collisional relaxation time in a general self-gravitating N-body system.
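Consider a self-gravitating system, like a galaxy, of identical particles (stars). Consider a two-particle encounter within the framework of the impulse approximation: a star of mass m moves at speed v along a straight line past a second star of mass m with impact parameter b, x = vt is its position along the trajectory measured from the point of closest approach, and θ is the angle between the line joining the two stars and the perpendicular to the trajectory. The force perpendicular to the trajectory is then
$$ F_\perp = \frac{Gm^2\cos\theta}{x^2+b^2} = \frac{Gm^2b}{(x^2+b^2)^{3/2}} = \frac{Gm^2}{b^2\left[1+\left(\frac{x}{b}\right)^2\right]^{3/2}}, \qquad F_\perp = m\dot v_\perp = \frac{Gm^2}{b^2\left[1+\left(\frac{vt}{b}\right)^2\right]^{3/2}}. \tag{500} $$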
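Therefore, the change imparted to v⊥ from one collision is (after making a substitution s ≡ vt/b):
$$ \delta v_\perp \simeq \frac{Gm}{b^2}\int_{-\infty}^{\infty}\frac{dt}{\left[1+\left(\frac{vt}{b}\right)^2\right]^{3/2}} = \frac{Gm}{bv}\int_{-\infty}^{\infty}\frac{ds}{(1+s^2)^{3/2}} = \frac{2Gm}{bv}. \tag{501} $$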
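The value of the dimensionless integral in eq. (501) follows from a standard trigonometric substitution, spelled out here as a supplementary step. Setting s = tan u, so that ds = sec² u du and (1 + s²)^{3/2} = sec³ u, the limits s → ±∞ map to u = ±π/2 and

$$ \int_{-\infty}^{\infty}\frac{ds}{(1+s^2)^{3/2}} = \int_{-\pi/2}^{\pi/2}\frac{\sec^2u}{\sec^3u}\,du = \int_{-\pi/2}^{\pi/2}\cos u\,du = 2. $$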
Note that (conceptually):
$$ \delta v_\perp \sim \frac{Gm}{b^2}\,\frac{2b}{v} \sim (\text{impulsive force})\times(\text{duration of interaction}). \tag{502} $$
The time it takes a particle to cross the whole system is the “crossing time” τ_cr, so τ_cr ≃ 2R/v,
with R denoting the characteristic size (radius) of the system. The number of collisions this particle
encounters in one crossing is, in the range (b, b + db):
$$ \delta n_c \sim \frac{\text{\# of particles}}{\text{cross-sectional area}}\,2\pi b\,db \sim \frac{N}{\pi R^2}\,2\pi b\,db \sim 2N\frac{b\,db}{R^2}. \tag{503} $$
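Therefore, the mean-square change in velocity as the particle “random-walks” through the system
(due to collisions) is
$$ \langle\delta v_\perp^2\rangle \simeq (\delta v_\perp)^2\,\delta n_c \simeq \left(\frac{2Gm}{bv}\right)^2 2N\frac{b\,db}{R^2} \simeq 8N\left(\frac{Gm}{Rv}\right)^2\frac{db}{b}. \tag{504} $$
To get the total change, integrate over all impact parameters:
$$ \Delta v_\perp^2 \simeq 8N\left(\frac{Gm}{Rv}\right)^2\int_{b_{min}}^{R}\frac{db}{b} \simeq 8N\left(\frac{Gm}{Rv}\right)^2\ln\left(\frac{R}{b_{min}}\right). \tag{505} $$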
This is the total effect of individual collisions in one crossing time.
The virial theorem for a self-gravitating system gives 2T̄ = −V̄, where bars denote time-averages,
so the typical particle speed satisfies
$$ 2\cdot\frac{1}{2}mv^2 \simeq \frac{GNm\,m}{R} \implies v^2 \simeq \frac{GNm}{R}. \tag{506} $$
We estimate b_min by presuming the virial theorem also applies, in some average sense, to a close
encounter (or, in other words, T is sufficiently larger than |V| so as to avoid forming a bound binary
system):
$$ v^2 \simeq \frac{Gm}{b_{min}} \implies \frac{R}{b_{min}} \simeq N \implies b_{min} \simeq \frac{R}{N}. \tag{507} $$
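The number of crossings needed for Δv⊥² to grow to v², at which point the particle has completely
forgotten its initial conditions, is
$$ n_{cr} \equiv \frac{v^2}{\Delta v_\perp^2} = \frac{1}{8N}\left(\frac{Rv^2}{Gm}\right)^2\frac{1}{\ln(R/b_{min})} = \frac{1}{8}\frac{R}{NGm}\frac{GNm}{R}\frac{Rv^2}{Gm}\frac{1}{\ln N} = \frac{1}{8}\frac{Rv^2}{Gm}\frac{1}{\ln N} = \frac{1}{8}\frac{N}{\ln N} \simeq \frac{0.1N}{\ln N}, \tag{508} $$
and the corresponding relaxation time is
$$ \tau_R = n_{cr}\tau_{cr} \simeq \frac{0.1N}{\ln N}\tau_{cr} \gg \tau_{cr}. \tag{509} $$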
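Let us now estimate the crossing time for the self-gravitating system τ_cr. Consider a particle
freely-falling along a diameter of a uniform-density sphere:
$$ \ddot r = -\frac{GM(r)}{r^2} = -\frac{G\frac{4\pi}{3}r^3\rho}{r^2} = -\frac{4\pi G}{3}\rho\,r \implies \ddot r + \frac{4\pi G}{3}\rho\,r = 0 \implies \ddot r + \omega^2r = 0, $$
$$ \omega^2 = \frac{4\pi}{3}G\rho = \left(\frac{2\pi}{2\tau_{cr}}\right)^2 \implies \tau_{cr} = \sqrt{\frac{3\pi}{4G\rho}} \implies \tau_{cr} \simeq \frac{1}{\sqrt{G\rho}}. \tag{510} $$
Therefore, the estimated collisional relaxation time for a typical self-gravitating N-body system is
$$ \tau_R \simeq \frac{0.1N}{\ln N}\frac{1}{\sqrt{G\rho}}. \tag{511} $$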
Collisional relaxation time for a typical elliptical galaxy.
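A typical elliptical galaxy contains about 10^12 stars of typical mass M⊙, and has a radius of
about R ≈ 100 kpc, so
$$ N \simeq 10^{12}, $$
$$ R \simeq 100\ \text{kpc} \simeq 10^5(3.26)\ \text{light-years} \simeq 10^5(3.26)(3\times10^8\ \text{m s}^{-1})(\pi\times10^7\ \text{s}) \simeq 3\times10^{21}\ \text{m}, $$
$$ m \simeq M_\odot \simeq 2\times10^{30}\ \text{kg}. $$
Taking ρ ∼ Nm/R³ in eq. (511), we find
$$ \tau_R \simeq \frac{0.1}{\ln(10^{12})}\left[\frac{10^{12}(3\times10^{21})^3}{(6.7\times10^{-11})(2\times10^{30})}\right]^{1/2} \simeq 5\times10^{25}\ \text{s} \simeq \frac{5\times10^{25}}{3\times10^7}\ \text{years} $$
$$ \implies \tau_R \simeq 10^{18}\ \text{years} \sim 10^8\,t_{Hubble}. \tag{512} $$
The relaxation time due to collisions is orders of magnitude longer than the age of the Universe,
which means that galaxies are well described by the collisionless, mean-field approximation and
the collisionless Boltzmann equation.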
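The arithmetic behind this estimate is easy to reproduce. The short sketch below evaluates eq. (511) with the rough density ρ ∼ Nm/R³ (an order-of-magnitude assumption; factors such as 4π/3 are dropped) and the rounded SI values quoted above. The function name is a label chosen for this example.

```python
import math

G_SI = 6.7e-11             # gravitational constant [m^3 kg^-1 s^-2], rounded as in the notes
SECONDS_PER_YEAR = 3.0e7   # rounded value used in the notes


def relaxation_time(n_stars, radius_m, star_mass_kg):
    """Two-body relaxation time [s] from eq. (511), taking rho ~ N m / R^3."""
    crossing_time = math.sqrt(radius_m**3 / (G_SI * n_stars * star_mass_kg))  # ~ 1/sqrt(G rho)
    return 0.1 * n_stars / math.log(n_stars) * crossing_time


tau_s = relaxation_time(n_stars=1e12, radius_m=3e21, star_mass_kg=2e30)
tau_yr = tau_s / SECONDS_PER_YEAR
print(f"tau_R ~ {tau_s:.1e} s ~ {tau_yr:.1e} yr")
```

Changing `n_stars` shows how strongly the result depends on N: for a globular cluster with N of order 10^5 to 10^6 and a much smaller radius, the same formula gives a relaxation time short enough for collisions to matter.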
Lecture 25: Galaxies: Analytic Models
“Science is simply common sense at its best that is, rigidly accurate in observation, and merciless
to fallacy in logic.”
Thomas Henry Huxley
The Big Picture: Last time we showed that individual stellar encounters are unimportant in the
dynamics of the galaxy, which justifies the mean-field approximation and the use of the collisionless Boltzmann equation. Today we derive the collisionless Boltzmann equation in the context of
galaxies, formulate the self-consistent problem and outline a few analytic approaches to solving it.
The study of galactic systems — the dynamics, kinematics, morphology — is a major tool in
comprehending some of the key issues in astrophysics relating to the origin, evolution and structure
of the Universe.
In modeling of galactic systems, we move from the simplest approximations to galaxy shapes
(spherical — 1 dof) to more general (axisymmetric — 2 dof; and triaxial — 3 dof). However, we
first must establish which equations govern the dynamics of galactic systems.
The Collisionless Boltzmann Equation
Earlier, we have demonstrated that in galaxies the stellar encounters are unimportant; in other
words, the relaxation time due to collisions is considerably (orders of magnitude!) longer than the
age of the Universe. This justifies the collisionless approximation and the use of the collisionless
Boltzmann equation (also known as the Vlasov equation).
Imagine a large number of stars moving under the influence of a smooth potential Φ(x, t). At
any time t, a full description of the state of any collisionless system is given by specifying the number
of stars f(x, v, t) d³x d³v having positions in the small volume d³x centered on x and velocities in
the small range d³v centered on v. The quantity f(x, v, t) is called the distribution function or
phase-space density of the system. Clearly f ≥ 0 everywhere.
If we know the initial coordinates and velocities of every star, Newton’s laws enable us to
evaluate their positions and velocities at any later time. Thus, given f (x, v, t0 ), it should be
possible to calculate f (x, v, t) for any t using only the information that is contained in f (x, v, t0 ).
Now, consider the flow of points in phase space that arises as stars move along their orbits. The
coordinates in phase-space are
$$ (\mathbf{x}, \mathbf{v}) \equiv \mathbf{w} \equiv (w_1, \dots, w_6), \tag{513} $$
so that the velocity of this flow can be written as
$$ \dot{\mathbf{w}} = (\dot{\mathbf{x}}, \dot{\mathbf{v}}) = (\dot{\mathbf{x}}, -\nabla\Phi), \tag{514} $$
where we have used the equation of motion v̇ = −∇Φ from the Hamiltonian formulation.
A characteristic of the flow described by ẇ is that it conserves stars: in the absence of encounters
stars do not jump from one point in phase-space to another, but rather drift smoothly through
phase-space. Therefore, the density of stars f(w, t) satisfies a continuity equation analogous to that
satisfied by the density ρ(x, t) of the ordinary fluid flow:
$$ \frac{\partial f}{\partial t} + \sum_{i=1}^{6}\frac{\partial(f\dot w_i)}{\partial w_i} = 0. \tag{515} $$
The physical content of this equation can be seen by integrating it over some volume of phase
space. The first term then describes the rate at which the collection of stars inside this volume is
increasing, while an application of the divergence theorem shows that the second term describes
the rate at which stars flow out of this volume.
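The flow described by ẇ is very special, because it has the property that
$$ \sum_{i=1}^{6}\frac{\partial\dot w_i}{\partial w_i} = \sum_{j=1}^{3}\left(\frac{\partial v_j}{\partial x_j} + \frac{\partial\dot v_j}{\partial v_j}\right) = \sum_{j=1}^{3} -\frac{\partial}{\partial v_j}\left(\frac{\partial\Phi}{\partial x_j}\right) = 0. \tag{516} $$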
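Here (∂v_j/∂x_j) = 0 because v_j and x_j are independent coordinates of phase-space, and the last
step follows because ∇Φ does not depend on velocities. If we use eq. (516) to simplify eq. (515),
we obtain the collisionless Boltzmann equation (also known as the Vlasov equation):
$$ \begin{aligned} \frac{\partial f}{\partial t} + \sum_{i=1}^{6}\frac{\partial(f\dot w_i)}{\partial w_i} &= 0 \\ \frac{\partial f}{\partial t} + \sum_{i=1}^{6}\left(f\frac{\partial\dot w_i}{\partial w_i} + \dot w_i\frac{\partial f}{\partial w_i}\right) &= 0 \\ \frac{\partial f}{\partial t} + \sum_{i=1}^{3}\left(\dot x_i\frac{\partial f}{\partial x_i} + \dot v_i\frac{\partial f}{\partial v_i}\right) &= 0 \\ \frac{\partial f}{\partial t} + \sum_{i=1}^{3}\left(v_i\frac{\partial f}{\partial x_i} - \frac{\partial\Phi}{\partial x_i}\frac{\partial f}{\partial v_i}\right) &= 0 \end{aligned} \tag{517} $$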
or, in vector notation
$$ \frac{\partial f}{\partial t} + \mathbf{v}\cdot\nabla f - \nabla\Phi\cdot\frac{\partial f}{\partial\mathbf{v}} = 0. \tag{518} $$
Equation (518) is the fundamental equation of stellar dynamics.
The meaning of the collisionless Boltzmann equation can be clarified by extending to six dimensions the concept of the convective derivative. We define
$$ \frac{df}{dt} \equiv \frac{\partial f}{\partial t} + \sum_{i=1}^{6}\dot w_i\frac{\partial f}{\partial w_i}. \tag{519} $$
df/dt represents the rate of change of density of phase points as seen by an observer who moves
through phase-space with a star at velocity ẇ. The collisionless Boltzmann equation is then simply
$$ \frac{df}{dt} = 0. \tag{520} $$
In words, the flow of stellar phase points through phase-space is incompressible; the phase-space
density f around the phase point of a given star always remains the same.
The Self-Consistent Problem
The collisionless Boltzmann equation does not by itself provide a closed system of equations. In order
to have a closed system, we must have as many equations as we have unknown quantities. Here,
it means that we must relate Φ and f. The Poisson equation
$$ \Delta\Phi(\mathbf{x}, t) = 4\pi G\rho(\mathbf{x}, t) \tag{521} $$
relates the potential Φ(x, t) to the mass-density ρ(x, t). Finally, the density ρ(x, t)
and the distribution function f(x, v, t) are related as
$$ \rho(\mathbf{x}, t) = \int f(\mathbf{x}, \mathbf{v}, t)\,d^3v, \tag{522} $$
which provides the link Φ ↔ ρ ↔ f, and closes the system of equations. Solving the system of
equations
$$ \begin{aligned} \frac{\partial f}{\partial t} + \mathbf{v}\cdot\nabla f - \nabla\Phi\cdot\frac{\partial f}{\partial\mathbf{v}} &= 0, \\ \rho(\mathbf{x}, t) &= \int f(\mathbf{x}, \mathbf{v}, t)\,d^3v, \\ \Delta\Phi(\mathbf{x}, t) &= 4\pi G\rho(\mathbf{x}, t) \end{aligned} \tag{523} $$
simultaneously is called the self-consistent problem.
Integrals of Motion and Jeans Theorem
An integral of motion I(x, v) is any function of the phase-space coordinates (x, v) that is
constant along any orbit:
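$$ I[\mathbf{x}(t_1), \mathbf{v}(t_1)] = I[\mathbf{x}(t_2), \mathbf{v}(t_2)], \tag{524} $$
or
$$ \frac{d}{dt}I[\mathbf{x}(t), \mathbf{v}(t)] = 0 = \frac{\partial I}{\partial\mathbf{x}}\cdot\frac{d\mathbf{x}}{dt} + \frac{\partial I}{\partial\mathbf{v}}\cdot\frac{d\mathbf{v}}{dt} = \mathbf{v}\cdot\frac{\partial I}{\partial\mathbf{x}} - \nabla\Phi\cdot\frac{\partial I}{\partial\mathbf{v}}, \tag{525} $$
so that I itself satisfies the steady-state collisionless Boltzmann equation. This leads to the following theorems.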
Jeans theorem. Any steady-state solution of the collisionless Boltzmann equation depends on the
phase-space coordinates only through integrals of motion in the galactic potential, and any function
of the integrals yields a steady-state solution of the collisionless Boltzmann equation.
Strong Jeans theorem. The DF of a steady-state galaxy in which almost all orbits are regular
with incommensurate frequencies may be presumed to be a function only of the three independent
isolating integrals.
In other words, the Jeans theorem tells us that if I1 ,..., I5 are five independent integrals of
motion in a given potential, then any DFs of the forms f (I1 ), f (I1 , I2 ), ..., f (I1 , ..., I5 ) are solutions
of the collisionless Boltzmann equation. The strong Jeans theorem tells us that if the potential
is regular (integrable), for all practical purposes any time-independent galaxy may be represented
by a solution of the form f (I1 , I2 , I3 ), where I1 , I2 and I3 are any three independent integrals of
motion.
For example, in a spherical system (1 dof), the DF is a function of energy: f(E); in an (integrable) axisymmetric system (2 dof), the DF is a function of energy and the z-component of the
angular momentum: f(E, L_z); and in an (integrable) triaxial system (3 dof), the DF is a function of
energy and two more integrals of motion: f(E, I_2, I_3). In general, integrals of motion I_2 and I_3 are
not known, except in very special cases (of limited physical importance). For equilibrium models
the potential is time-independent, so the energy is conserved, and is therefore an integral of motion.
So, how does one construct DFs for galactic models?
Analytic Solutions to the Self-Consistent Problem
The DFs for galactic models can be obtained analytically only for a few special cases. These
special cases are important phenomenologically and pedagogically, as they offer a “peek” into the
dynamics of galaxies. However, their physical relevance is limited, because they represent either
simple 1 dof models (spheres), or density distributions which give poor fits to the observed profiles.
From f to ρ.
As a simple spherical model (1 dof), one can start with the predefined DF f (E) and compute
the corresponding ρ. This is the most straightforward method. The drawback of this approach,
however, is that the properties of the resulting density distribution are not adjustable to fit the
observed profiles.
We start with an assumed form of the DF f , integrate to obtain ρ, and solve the Poisson
equation to get the corresponding Φ.
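Define relative potential and relative energy, respectively:
$$ \Psi \equiv -\Phi + \Phi_0, \qquad \varepsilon \equiv -E + \Phi_0 = \Psi - \frac{1}{2}v^2, \tag{526} $$
and assume the DF of the following form:
$$ f(\varepsilon) = \begin{cases} F\varepsilon^{n-3/2} & \varepsilon > 0, \\ 0 & \varepsilon \le 0, \end{cases} \tag{527} $$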
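where F is a constant. Then the mass-density is computed by integrating over velocities [see
eq. (522)]:
$$ \rho(\mathbf{x}) = \int f(\varepsilon)\,d^3v = \int_0^{\infty}f\left(\Psi - \frac{1}{2}v^2\right)4\pi v^2dv = 4\pi F\int_0^{\sqrt{2\Psi}}\left(\Psi - \frac{1}{2}v^2\right)^{n-3/2}v^2dv, \tag{528} $$
where we have used d³v = 4πv² dv. After introducing the variable θ, such that v² = 2Ψ cos²θ, we
obtain
$$ \begin{aligned} \rho(\mathbf{x}) &= 4\pi F\int_0^{\pi/2}\Psi^{n-3/2}\left(1 - \cos^2\theta\right)^{n-3/2}\,2\Psi\cos^2\theta\,\sqrt{2\Psi}\sin\theta\,d\theta \\ &= 8\sqrt{2}\pi F\Psi^n\int_0^{\pi/2}\sin^{2n-2}\theta\cos^2\theta\,d\theta \\ &= 8\sqrt{2}\pi F\Psi^n\left[\int_0^{\pi/2}\sin^{2n-2}\theta\,d\theta - \int_0^{\pi/2}\sin^{2n}\theta\,d\theta\right] \\ \implies \rho(\mathbf{x}) &= c_n\Psi^n, \end{aligned} \tag{529} $$
where
$$ c_n = \frac{(2\pi)^{3/2}\left(n - \frac{3}{2}\right)!}{n!}F. \tag{530} $$
For c_n to be finite, n > 1/2.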
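The closed form in eq. (530) can be checked against the θ-integral in eq. (529). The sketch below is an illustrative numerical check (function names are chosen here): half-integer factorials are written as gamma functions, (n − 3/2)! = Γ(n − 1/2) and n! = Γ(n + 1), and the integral is evaluated with composite Simpson's rule. It is restricted to n ≥ 1, where the integrand is finite at θ = 0.

```python
import math


def c_n_closed_form(n, F=1.0):
    """c_n of eq. (530), with factorials expressed through the gamma function."""
    return (2.0 * math.pi) ** 1.5 * math.gamma(n - 0.5) / math.gamma(n + 1.0) * F


def c_n_numeric(n, F=1.0, steps=2000):
    """c_n from the theta-integral of eq. (529) via composite Simpson's rule (n >= 1)."""
    def integrand(theta):
        return math.sin(theta) ** (2 * n - 2) * math.cos(theta) ** 2

    a, b = 0.0, math.pi / 2.0
    h = (b - a) / steps                      # steps must be even for Simpson's rule
    total = integrand(a) + integrand(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(a + i * h)
    return 8.0 * math.sqrt(2.0) * math.pi * F * total * h / 3.0


for n in (1, 1.5, 5):
    print(n, c_n_closed_form(n), c_n_numeric(n))
```

The two columns should agree to many digits; n = 5 is the case used for the Plummer model later in this lecture.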
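We now solve the Poisson equation by substituting the eqs. (526) and (529) into the eq. (521)
expressed in spherical coordinates:
$$ \frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{d\Phi}{dr}\right) = 4\pi G\rho \implies -\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{d\Psi}{dr}\right) = 4\pi Gc_n\Psi^n. \tag{531} $$
Now let
$$ s \equiv \frac{r}{b}, \qquad \varphi \equiv \frac{\Psi}{\Psi_0}, \qquad b \equiv \frac{1}{\sqrt{4\pi G\Psi_0^{n-1}c_n}}. \tag{532} $$
Then we arrive at
$$ \frac{1}{s^2}\frac{d}{ds}\left(s^2\frac{d\varphi}{ds}\right) = \begin{cases} -\varphi^n & \varphi > 0, \\ 0 & \varphi \le 0, \end{cases} \tag{533} $$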
which is the Lane-Emden equation for polytropes! Again, this second-order ODE is to be solved
with the initial conditions:
1. ϕ(0) = 1 by definition;
2. dϕ/ds = 0 at s = 0: no gravitational force at the center.
Table 8: Properties of the solutions to the Lane-Emden equation [γ = (n + 1)/n].
Lane-Emden index n    radius      mass        polytropic index γ
1 ≤ n < 5             finite      finite      6/5 < γ ≤ 2
n = 5                 infinite    finite      γ = 6/5
5 < n < ∞             infinite    infinite    1 < γ < 6/5
n = ∞                 infinite    infinite    γ = 1
One of the popular early simple models for the DF in a spherical galaxy is the solution to the
Lane-Emden equation with n = 5. It is called the Plummer model:
$$ f(\varepsilon) = F\varepsilon^{7/2}, \qquad \Phi(r) = -\frac{GM}{\sqrt{r^2+b^2}}, \qquad \rho(r) = \frac{3M}{4\pi}\frac{b^2}{(r^2+b^2)^{5/2}}. \tag{534} $$
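Another simple spherical model (1 dof) is obtained by starting with a predefined density ρ(r)
and computing the corresponding DF f(E).
We first invert the integral for ρ in terms of f, in order to get f in terms of ρ. With ε = Ψ(r) − v²/2 and dε = −v dv,
$$ \begin{aligned} \rho(r) &= \int_0^{\sqrt{2\Psi(r)}}f(\varepsilon)\,4\pi v^2dv, \\ \rho(\Psi) &= 4\pi\sqrt{2}\int_{\varepsilon=0}^{\Psi}f(\varepsilon)\sqrt{\Psi - \varepsilon}\,d\varepsilon, \\ \frac{d\rho(\Psi)}{d\Psi} &= 2\pi\sqrt{2}\int_{\varepsilon=0}^{\Psi}\frac{f(\varepsilon)}{\sqrt{\Psi - \varepsilon}}\,d\varepsilon. \end{aligned} \tag{535} $$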
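Figure 43: Region of integration for the integral in the eq. (536).
The last line represents the Abel integral equation, which can be solved explicitly. Multiply both
sides by 1/√(ε_0 − Ψ) and integrate with respect to Ψ from 0 to ε_0:
$$ \int_0^{\varepsilon_0}\frac{\rho'(\Psi)}{\sqrt{\varepsilon_0 - \Psi}}\,d\Psi = 2\pi\sqrt{2}\int_0^{\varepsilon_0}\frac{d\Psi}{\sqrt{\varepsilon_0 - \Psi}}\int_0^{\Psi}\frac{f(\varepsilon)}{\sqrt{\Psi - \varepsilon}}\,d\varepsilon = 2\pi\sqrt{2}\int_0^{\varepsilon_0}f(\varepsilon)\,d\varepsilon\int_{\varepsilon}^{\varepsilon_0}\frac{d\Psi}{\sqrt{(\varepsilon_0 - \Psi)(\Psi - \varepsilon)}}. \tag{536} $$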
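After setting Ψ = ε + (ε_0 − ε) sin²χ, the inner integral becomes
$$ \int_0^{\pi/2}\frac{2(\varepsilon_0 - \varepsilon)\sin\chi\cos\chi}{\sqrt{(\varepsilon_0 - \varepsilon)\cos^2\chi\,(\varepsilon_0 - \varepsilon)\sin^2\chi}}\,d\chi = 2\cdot\frac{\pi}{2} = \pi, \tag{537} $$
so the integral in eq. (536) becomes
$$ \begin{aligned} \int_0^{\varepsilon_0}f(\varepsilon)\,d\varepsilon &= \frac{1}{2\sqrt{2}\pi^2}\int_0^{\varepsilon_0}\frac{\rho'(\Psi)}{\sqrt{\varepsilon_0 - \Psi}}\,d\Psi, \\ \implies f(\varepsilon_0) &= \frac{1}{2\sqrt{2}\pi^2}\frac{d}{d\varepsilon_0}\int_0^{\varepsilon_0}\frac{\rho'(\Psi)}{\sqrt{\varepsilon_0 - \Psi}}\,d\Psi. \end{aligned} \tag{538} $$
Now integrate the integral in the eq. (538) by parts:
$$ \begin{aligned} \int_0^{\varepsilon_0}\frac{\rho'(\Psi)}{\sqrt{\varepsilon_0 - \Psi}}\,d\Psi &= \left[\rho'(\Psi)\left(-2\sqrt{\varepsilon_0 - \Psi}\right)\right]_0^{\varepsilon_0} - \int_0^{\varepsilon_0}\rho''(\Psi)\left(-2\sqrt{\varepsilon_0 - \Psi}\right)d\Psi \\ &= 2\rho'(0)\sqrt{\varepsilon_0} + 2\int_0^{\varepsilon_0}\rho''(\Psi)\sqrt{\varepsilon_0 - \Psi}\,d\Psi, \end{aligned} \tag{539} $$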
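so
$$ f(\varepsilon_0) = \frac{1}{2\sqrt{2}\pi^2}\left[\frac{\rho'(0)}{\sqrt{\varepsilon_0}} + \int_0^{\varepsilon_0}\frac{\rho''(\Psi)}{\sqrt{\varepsilon_0 - \Psi}}\,d\Psi\right]. \tag{540} $$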
Equations (538) and (540) are two variants of Eddington’s formula.
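We now apply Eddington's formula [top line of eq. (538)] to the density used in the approach
“from f to ρ”, ρ(r) = c_n Ψ^n. Setting t ≡ Ψ/ε_0,
$$ \begin{aligned} \int_0^{\varepsilon_0}f(\varepsilon)\,d\varepsilon &= \frac{nc_n}{2\sqrt{2}\pi^2}\int_0^{\varepsilon_0}\frac{\Psi^{n-1}}{\sqrt{\varepsilon_0 - \Psi}}\,d\Psi = \frac{nc_n}{2\sqrt{2}\pi^2}\int_0^1\frac{\varepsilon_0^{n-1}t^{n-1}}{\sqrt{\varepsilon_0}\sqrt{1-t}}\,\varepsilon_0\,dt \\ &= \frac{nc_n}{2\sqrt{2}\pi^2}\,\varepsilon_0^{n-1/2}\,B\left(n, \tfrac{1}{2}\right) = \frac{nc_n}{2\sqrt{2}\pi^2}\frac{\Gamma(n)\Gamma\left(\tfrac{1}{2}\right)}{\Gamma\left(n + \tfrac{1}{2}\right)}\,\varepsilon_0^{n-1/2}. \end{aligned} \tag{541} $$
Since Γ(1/2) = √π [recall Γ(n) = (n − 1)!], differentiating with respect to ε_0 gives
$$ f(\varepsilon_0) = \frac{nc_n}{2\sqrt{2}\pi^2}\left(n - \frac{1}{2}\right)\frac{(n-1)!\sqrt{\pi}}{\left(n - \frac{1}{2}\right)!}\,\varepsilon_0^{n-3/2} = \frac{n!\,c_n}{(2\pi)^{3/2}\left(n - \frac{3}{2}\right)!}\,\varepsilon_0^{n-3/2} = F\varepsilon_0^{n-3/2}. \tag{542} $$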
Therefore, we recover the DF used in the approach “from f to ρ”, as we should.
Separable (Stäckel) potentials.
Separable (Stäckel) potentials are a special family of 3D potentials for which the equations of
motion separate, and are explicitly known, in ellipsoidal coordinates (λ, µ, ν), defined as the
roots τ of the equation:
$$ \frac{x^2}{\tau + \alpha} + \frac{y^2}{\tau + \beta} + \frac{z^2}{\tau + \gamma} = 1, \tag{543} $$
where (x, y, z) are Cartesian coordinates and α, β and γ are constants determining the triaxial
shape of the model. We adopt a convention 0 ≤ −γ ≤ ν ≤ −β ≤ µ ≤ −α ≤ λ.
All three integrals of motion have an analytic representation, as well as the density, potential
and the DFs. Orbits in these potentials are combinations of oscillations and rotations in ellipsoidal
coordinates. They are either tubes (along short and long axes) or boxes.
Whereas the separable potentials are not a very good fit to the observed galaxy density profiles
(and are therefore of limited use in practice), they provide us with insight into the dynamics of
triaxial systems: the orbits in other, physically more faithful integrable potentials, are generally of
the same type as in separable potentials. For more on separable potentials, see the seminal paper by
de Zeeuw (1985, MNRAS, 216, 273): http://adsabs.harvard.edu/abs/1985MNRAS.216..273D
Lecture 26: Galaxies: Numerical Models
“All science is either physics or stamp collecting.”
Ernest Rutherford
The Big Picture: Last time we derived the collisionless Boltzmann equation in the context of
galaxies, formulated the self-consistent problem and outlined a few analytical approaches to solving
it. In search of a physically more faithful model of realistic galaxies, today we talk about numerical
simulations. We outline the main approaches, along with their advantages and disadvantages.
Numerical Simulations of Galaxies
Realistic galaxy models — which often include non-integrable and time-dependent potentials in
3 dof — are not analytically tractable. Numerical simulations are our only hope in understanding
the fundamental aspects of the underlying dynamics of these systems, such as:
• the non-linear collective phenomena leading to small-scale structure (central cusps, globular
clusters, bars, arms, etc...);
• mechanisms which drive the system toward equilibrium;
• correlation between physical properties of the galaxy (size, luminosity, mass of the central
supermassive black hole, velocity dispersion, etc...), as hints about the galaxy evolution.
The numerical techniques invoked in simulating galaxies differ in their implementation of the
physical problem. N-body simulations attempt to solve the physical problem in a direct way: particles interacting with each other via the gravitational $1/r^2$ force. The Schwarzschild orbit superposition
method assumes a time-independent system (in equilibrium), and solves the self-consistent problem.
Distinguishing between numerical artifacts and physics intrinsic to these multiparticle systems becomes a major challenge. We now discuss each one of these approaches in some detail.
N-Body Simulations
In N-body simulations, the N “macroparticles” sampling the initial DF are evolved under each
other’s gravitational influence. Implementing a perfectly faithful representation of the physical
system is computationally prohibitive for two main reasons:
1. Size of the system: the number of “particles” (stars) in a realistic galaxy is huge: $N \approx 10^{12}$;
2. Scaling of the interaction: because gravity is a force with an infinite range, each star “feels” the
gravitational force due to every other star in the system, which means that the number of
interactions scales as $O(N^2)$.
The three main types of N-body codes: (i) direct summation, (ii) tree, and (iii) particle-in-cell,
invoke different approximations to deal with these problems.
Direct summation samples the initial DF by $N_{\rm part}$ macroparticles and evolves them via particle-to-particle interaction.
• Advantage: The implementation is closest to the physical problem (individual particles interacting with each other).
• Disadvantages:
1. Problem scales as $O(N^2)$, which becomes computationally prohibitive quite quickly.
2. Particle collisions become a computational “bottleneck”, because the timestep of evolution of the system is the smallest needed to preserve a predefined accuracy. When two
macroparticles get very close to each other, the forces become quite large and accuracy
is compromised, prompting an ever-decreasing timestep (until finally the system comes
to a complete halt). This problem can be alleviated either by: (i) softening of the $1/r^2$
power law to $1/\sqrt{r^4+b^4}$ (effectively making the particles miniature spheres, as opposed
to point particles); or (ii) regularization: changing to a different (non-singular) set of
variables locally when particles get “dangerously close” to impact.
The number of macroparticles $N_{\rm part}$ is orders of magnitude smaller than the number of particles in
the system N, which introduces unphysical forces and noise.
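The direct-summation scheme with the softened force law described above can be sketched as follows. This is a minimal illustration, not a production N-body code: the function name `accelerations`, the units (G = 1) and the two-particle usage example are assumptions made for the sake of the example, and no time integrator is included.

```python
import math

def accelerations(pos, mass, b=0.0, G=1.0):
    """Direct-summation accelerations for N macroparticles (O(N^2) pair loop).

    pos  : list of (x, y, z) tuples
    mass : list of masses
    b    : softening length; the pair force falls off as 1/sqrt(r^4 + b^4),
           which reduces to the Newtonian 1/r^2 law for r >> b.
    """
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = [pos[j][k] - pos[i][k] for k in range(3)]
            r = math.sqrt(sum(c * c for c in d))
            if r == 0.0:
                continue  # coincident particles: direction undefined, skip the pair
            strength = G / math.sqrt(r ** 4 + b ** 4)
            for k in range(3):
                unit = d[k] / r
                acc[i][k] += mass[j] * strength * unit   # i is pulled toward j
                acc[j][k] -= mass[i] * strength * unit   # equal and opposite force on j
    return acc

# Usage: two unit masses a unit distance apart, without and with softening
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
masses = [1.0, 1.0]
newtonian = accelerations(positions, masses)
softened = accelerations(positions, masses, b=1.0)
print(newtonian[0], softened[0])
```

The double loop makes the O(N²) cost explicit, and the softened force stays finite (of order G m / b²) as r → 0, which is what prevents the timestep from collapsing during close encounters.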
Tree codes are a variation on direct summation: they use direct summation for nearby particles,
and invoke a statistical treatment of the effect of far-away particles.
• Advantage: The implementation is still close to the physical problem (individual particles
interacting with each other).
• Disadvantage: Although the scaling of the interactions is better than $O(N^2)$, it is still
expensive.
Particle-in-cell codes solve the self-consistent problem in which the DF is represented by a
collection of Npart macroparticles, on a finite discrete computational grid.
• Advantages:
1. Scales as O(k1 Npart ) + O(k2 Ngrid ), where k2 ≫ k1 (so, in most applications, it scales as
O(Ngrid ), where Ngrid is the number of gridpoints).
2. Allows for a lot more macroparticles $N_{\rm part}$.
• Disadvantage: Introduces discretization noise due to finiteness and discreteness of the computational domain.
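To make the grid step of a particle-in-cell code concrete, here is a hedged sketch of the mass-deposit stage only, reduced to one dimension. The cloud-in-cell (linear) weighting, the periodic boundary and the name `deposit_cic` are assumptions chosen for illustration; the lecture does not specify a particular deposit scheme, and the field solve and particle push are omitted.

```python
def deposit_cic(positions, masses, n_grid, length):
    """Cloud-in-cell deposit of macroparticle mass onto a periodic 1D grid.

    Each particle shares its mass linearly between its two nearest grid
    points; the function returns the mass density at the n_grid grid points.
    The cost is proportional to the number of macroparticles.
    """
    dx = length / n_grid
    rho = [0.0] * n_grid
    for x, m in zip(positions, masses):
        s = (x % length) / dx          # position in units of the grid spacing
        left = int(s)                  # index of the grid point to the left
        frac = s - left                # fractional distance from that point
        rho[left % n_grid] += m * (1.0 - frac) / dx
        rho[(left + 1) % n_grid] += m * frac / dx
    return rho

# Usage: three macroparticles on a 4-point grid of length 4
rho = deposit_cic([0.25, 1.5, 3.9], [1.0, 2.0, 3.0], n_grid=4, length=4.0)
total_mass = sum(rho) * (4.0 / 4)
print(rho, total_mass)
```

The deposit conserves the total mass by construction, while the finite grid spacing is the source of the discretization noise mentioned above.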
Recently, the Beam Physics and Astrophysics Group at NICADD has been involved in developing
a new variant of particle-in-cell solvers which use wavelets to remove some of the numerical noise
intrinsic to the method (http://www.nicadd.niu.edu/∼bterzic/Research/TPB 2007.pdf).
Schwarzschild’s Orbit Superposition Method
Figure 44: Flow-chart for modeling galaxies using Schwarzschild’s method. The reference
is Chandrasekhar 1969, Ellipsoidal Figures of Equilibrium, Dover, New York. For details
on modeling individual galaxies by fitting them to a new family of mass-density profiles, see
http://www.nicadd.niu.edu/∼bterzic/Research/TG 2005.pdf.
Schwarzschild’s orbit superposition method divides the model into cells of a 3D sphere. Based
on the amount of time that orbit j spends in each of the cells i, the orbital density template $\rho_{ij}$ for each
orbit is computed. Now, we seek the set of non-negative weights $w_j$ for each of the orbits, such
that the weighted sum of all the orbital densities of the model will reproduce the starting density
distribution of the model $\rho_i$ in each of the cells i. That is,
$$
\rho_i = \sum_{j=1}^{N_o} w_j\,\rho_{ij},
\tag{544}
$$
where $N_o$ is the number of orbits and the normalized orbital densities are given by
$$
1 = \sum_{i=1}^{N_c}\rho_i,
\tag{545}
$$
with $N_c$ being the number of cells in a 3D sphere. Equations (544) and (545) constitute an optimization problem and can be solved in several ways, the most popular of which are the linear
programming and least squares methods.
Optimization problem.
Schwarzschild’s method is formulated as an optimization problem:
$$
\begin{aligned}
\text{minimize:}\quad & f(w_i), \\
\text{subject to:}\quad & \sum_{i=1}^{N_o} w_i\,\rho_{ij} = \rho_j, \qquad j = 1, 2, ..., N_c, \\
& w_i \ge 0, \qquad i = 1, 2, ..., N_o,
\end{aligned}
\tag{546}
$$
where $f(w_i)$ is the cost function, $\rho_{ij}$ is the contribution of the orbital density of the ith template to
the jth cell, $\rho_j$ is the model’s density in the jth cell and $w_i$ is the orbital weight of the ith orbit (note
that here i labels orbits and j labels cells). The
problem above becomes a linear programming problem (LPP) when the cost function is a simple linear function of the weights; for example, to minimize weights of orbits labeled from m to n, the cost
function would simply be $f(w_i) = \sum_{i=m}^{n} w_i$. The solutions of the LPP are often quite noisy, with entire
ranges of orbits carrying zero weights. It is customary to impose additional constraints in order to smooth out the solutions, such as minimizing the sum of squares of orbital weights (which
makes this a quadratic programming problem) or minimizing the least squares. (For a pedagogical
and detailed discourse on the implementation of Schwarzschild’s method for a special case of
scale-free potentials, see http://www.nicadd.niu.edu/∼bterzic/Research/chapter3.pdf).
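A toy version of the least-squares variant of eq. (546) can be sketched as follows. Everything specific here is an assumption made for illustration: the three-orbit, three-cell templates are invented numbers, `solve_weights` is a hypothetical name, and projected gradient descent is used only because it fits in a few lines; real applications use dedicated linear, quadratic or non-negative least-squares solvers.

```python
def solve_weights(rho_templates, rho_model, n_iter=5000):
    """Non-negative least squares by projected gradient descent.

    rho_templates[i][j] : density of orbit i in cell j (rho_ij of eq. 546)
    rho_model[j]        : model density in cell j
    Returns weights w_i >= 0 minimizing sum_j (sum_i w_i rho_ij - rho_j)^2.
    """
    n_orb, n_cell = len(rho_templates), len(rho_model)
    # 1 / (sum of squared entries) never exceeds the stable gradient step size
    step = 1.0 / sum(v * v for row in rho_templates for v in row)
    w = [1.0 / n_orb] * n_orb
    for _ in range(n_iter):
        resid = [sum(w[i] * rho_templates[i][j] for i in range(n_orb)) - rho_model[j]
                 for j in range(n_cell)]
        for i in range(n_orb):
            grad = sum(rho_templates[i][j] * resid[j] for j in range(n_cell))
            w[i] = max(0.0, w[i] - step * grad)   # project onto w_i >= 0
    return w

# Invented toy templates: each orbit's density is normalized to 1 over the cells
templates = [[0.5, 0.3, 0.2],
             [0.2, 0.5, 0.3],
             [0.0, 0.2, 0.8]]
true_weights = [0.2, 0.5, 0.3]
model = [sum(true_weights[i] * templates[i][j] for i in range(3)) for j in range(3)]

weights = solve_weights(templates, model)
print(weights)
```

Because the model density was built from known non-negative weights, the solver should recover them; with realistic, noisy templates the solution is generally not unique, which is why the additional smoothing constraints mentioned above are imposed.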
Chaotic orbits.
The orbital density templates ρij are computed so as to represent the time-averaged orbital density
of stars on that orbit, thus making them time-independent building blocks of a time-independent
solution to the self-consistent problem. Chaotic orbits (to be defined later in this lecture) cannot
have their individual orbital density templates included into Schwarzschild’s method because their
time-averaged density would change over time. Instead, chaotic orbits are usually averaged out into
a single chaotic super-orbit template, which is then included in Schwarzschild’s method. This is
because the chaotic portions of the phase space in 3 dof (and higher) are interconnected (Arnold’s web),
so all chaotic orbits in a given potential can be viewed as parts of one large chaotic super-orbit (i.e.,
if integrated long enough, in principle infinitely long, each chaotic orbit will sample all of the available
chaotic phase-space).
Chaos in Galactic Simulations
Decades of numerical simulations have shown that realistic galactic models feature a large number of chaotic orbits. As a case in point, even simple dynamical systems such as the gravitational
(restricted) three-body problem feature a large portion of chaotic orbits. Another example is a
numerical simulation of a 10-body model of a solar system, which found e-folding times for each
of the planets’ orbits in the range of 10–50 million years (Laskar 1993, Physica D, 67, 257). It
is then quite reasonable to expect that N-body simulations for which N ≫ 10 will feature chaotic
orbits.
In simulations which smooth over particle distribution by invoking a mean-field approximation,
such as integration of orbits in a smooth potential, the presence of chaos is not nearly as obvious. There, the
presence of chaos was only discovered after the integration of orbits revealed that the number
of integrals of motion was smaller than the number of degrees of freedom (Hénon & Heiles 1964,
Astronomical Journal, 69, 73).
Definition of chaos: Motion which exhibits sensitive dependence on initial conditions.
In other words, nearby orbits will diverge exponentially:
$$
d(t) = d(0)\,e^{\lambda t},
\tag{547}
$$
where d(0) is the initial separation of nearby orbits, d(t) is the separation of initially nearby orbits
at some later time t, and λ is the Lyapunov exponent.
The Lyapunov exponent is defined as
$$
\lambda = \lim_{\substack{t\to\infty \\ d(0)\to 0}}\frac{1}{t}\ln\frac{d(t)}{d(0)},
\tag{548}
$$
and is related to the “e-folding time” τe as τe = 1/λ. The e-folding time denotes a time-scale
after which one can no longer make quantitative predictions about the system. In other words —
loosely speaking — it is the time-scale after which the motion on the same orbit will be completely
uncorrelated.
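The definition in eq. (548) can be turned into a simple numerical estimate. The sketch below is only an illustration under stated assumptions: instead of a galactic orbit it uses the logistic map x → 4x(1 − x), a standard one-dimensional chaotic system that stands in for the dynamics, so that "time" is the iteration count. The function name and all parameter values are hypothetical. A nearby "shadow" orbit is restarted at separation d(0) after every step, and the logarithms of the stretching factors d(t)/d(0) are averaged.

```python
import math

def lyapunov_exponent(step, x0, d0=1e-9, n_steps=50000, n_skip=1000):
    """Average exponential divergence rate of nearby orbits of the map `step`."""
    x = x0
    for _ in range(n_skip):            # discard the initial transient
        x = step(x)
    log_sum = 0.0
    for _ in range(n_steps):
        x_next = step(x)
        d = abs(step(x + d0) - x_next)  # separation after one step
        if d > 0.0:
            log_sum += math.log(d / d0)
        x = x_next                      # shadow orbit is restarted at distance d0
    return log_sum / n_steps

def logistic(x):
    """Stand-in chaotic system (an assumption, not a galactic potential)."""
    return 4.0 * x * (1.0 - x)

lam = lyapunov_exponent(logistic, x0=0.3)
tau_e = 1.0 / lam                       # e-folding time, in units of iterations
print(lam, tau_e)
```

For this map the exponent is known analytically to be ln 2, so the estimate should come out close to that value; for a regular (non-chaotic) orbit the same procedure would give an exponent consistent with zero.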
Here a note of caution is appropriate: the colloquial use of the term “chaotic” has led to
a common misconception that chaos implies complete randomness. This is not the case: chaos
implies intrinsic inability to quantify the system beyond the e-folding time.
Regular motion is characterized by vanishing Lyapunov exponents. Orbits are well-defined, have
localized Fourier spectra, and “appear” regular (“quasi-periodic”). Regular motion in an N-dof
system is confined by its N integrals of motion to the surface of an N-dimensional torus residing
in the 2N-dimensional phase-space.
Chaotic (stochastic, irregular) motion is characterized by non-zero Lyapunov exponents. Orbits are not well-defined, have “fuzzy” Fourier spectra, and generally “appear” irregular, but not
always: “weakly chaotic” or “sticky” orbits can mimic regular behavior for long periods of time,
only to become “wildly chaotic” at later times (short-time Lyapunov exponents can vary drastically). Chaotic motion in an N -dof system is not confined to the surface of the N -dimensional
torus residing in the 2N -dimensional phase-space, because it does not have N integrals of motion.
Integrable potentials have as many integrals of motion as degrees of freedom. All orbits are
regular. Examples of integrable potentials include all spherically symmetric systems (there is no
chaos in 1D) and some axisymmetric potentials (2 dof).
Non-integrable potentials do not have as many global integrals of motion as degrees of
freedom. However, the presence of local integrals of motion is possible, so there are generally both
chaotic and regular orbits.
Relaxation of Multiparticle Systems
Earlier we computed the time needed for the system to reach equilibrium (relaxation time) through
collisions (close encounters) to be orders of magnitude longer than the Hubble time. This means
that if close encounters were the only relaxation mechanism at work, we should observe galaxies to
be far from equilibrium. Observations show the contrary: galaxies are to a good approximation
relaxed systems, in (or at least close to) equilibrium. It is therefore clear that there are other
mechanisms at work in driving the system toward equilibrium.
There are several mechanisms believed to be at work in galactic systems, as seen in copious
numerical studies.
Regular phase mixing (Landau damping) is present in both time-independent and time-dependent systems. It causes ensembles of regular orbits to spread out because of the initial spread in
their integrals of motion. If one imagines that nearby orbits reside on slightly different tori, their
subsequent evolution along the surfaces of their respective tori will result in their shear separation
(Fig. 46, top panel). The timescale for regular phase mixing depends on: (i) the size of the ensemble
in phase space; (ii) the crossing time for the ensemble. Generally speaking, regular phase mixing is
not a very powerful mechanism, but it is the only mechanism driving integrable systems toward
equilibrium.
Figure 45: Some of the most common orbits in scale-free potentials. Major orbital families: a) regular
box, b) chaotic box, c) regular long-axis tube, d) regular short-axis tube. Minor resonant families: e) x-y
fish, f) x-z fish, g) x-y pretzel, h) x-z pretzel. (From Terzić 2002, PhD thesis, Florida State University.
http://www.nicadd.niu.edu/∼bterzic/Research/dissertation.pdf).
Chaotic phase mixing (non-linear Landau damping) occurs in both time-independent and
time-dependent systems. Numerical simulations show that a microscopic ensemble of isoenergetic
test particles in a realistic galaxy potential around a chaotic orbit will mix on timescales
$t \sim 30$–$100\,t_{\rm cross}$. The ensemble will evolve to uniformly fill the isoenergy surface accessible to it. In
systems in which a large fraction of the phase-space is occupied by chaotic orbits, chaotic mixing
could be an important mechanism for driving secular evolution on timescales much shorter than
$t_{\rm collision}$ (Lynden-Bell 1967, MNRAS, 136, 101; Merritt & Valluri 1996, Astrophysical Journal, 471,
82).
Figure 46: Regular phase mixing (top) and chaotic phase mixing (bottom). (From Merritt & Valluri 1996,
Astrophysical Journal, 471, 82)
Figure 47: Chaotic phase mixing. (From Merritt & Valluri 1996, Astrophysical Journal, 471, 82)
Violent relaxation occurs only in time-dependent potentials. According to the virial theorem
$$
\frac{1}{2}\frac{d^2I}{dt^2} = 2T + V,
\tag{549}
$$
so that $2T/|V| = 1$ for a self-gravitating system in dynamical equilibrium. A system out of equilibrium will undergo oscillations during which the particles will exchange energy with the background
potential:
$$
\frac{dE}{dt} = -\frac{d\Phi}{dt}, \qquad
T_r = \left\langle\frac{\left(dE/dt\right)^2}{E^2}\right\rangle^{-1/2}
= \left\langle\frac{\left(d\Phi/dt\right)^2}{E^2}\right\rangle^{-1/2},
\tag{550}
$$
which leads to (Lynden-Bell 1967, MNRAS, 136, 101)
$$
T_r \simeq \frac{3P}{8\pi},
\tag{551}
$$
where P is the typical radial period of the orbit of a star. The violently changing gravitational field
of a newly formed galaxy is effective in driving the stellar orbits toward equilibrium on timescales
much shorter than the Hubble time. For a discussion of orbital structure (both regular and
chaotic) in time-dependent galactic potentials modeling conditions during violent relaxation, see
http://www.nicadd.niu.edu/∼bterzic/Research/TK 2005.pdf.
Appendix to Lecture 2
An Alternative Lagrangian
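In class we used an alternative Lagrangian
$$
L = g_{\gamma\delta}\,\dot{x}^\gamma\dot{x}^\delta,
$$
instead of the traditional
$$
L = \sqrt{g_{\gamma\delta}\,\dot{x}^\gamma\dot{x}^\delta}.
$$
Here is the justification why either works correctly, i.e., why the expression given in eq. (552) is a
Lagrangian that generates the geodesic equation.
We prove that by applying Lagrange’s equations
$$
\frac{\partial L}{\partial x^\alpha} - \frac{d}{d\lambda}\frac{\partial L}{\partial\dot{x}^\alpha} = 0
$$
to the expression in eq. (552), and recovering the geodesic equation:
$$
\begin{aligned}
\frac{\partial L}{\partial x^\alpha} &= g_{\gamma\delta,\alpha}\,\dot{x}^\gamma\dot{x}^\delta, \\
\frac{\partial L}{\partial\dot{x}^\alpha} &= g_{\alpha\delta}\,\dot{x}^\delta + g_{\gamma\alpha}\,\dot{x}^\gamma, \\
\frac{d}{d\lambda}\frac{\partial L}{\partial\dot{x}^\alpha} &= g_{\alpha\delta,\gamma}\,\dot{x}^\delta\dot{x}^\gamma + g_{\alpha\delta}\,\ddot{x}^\delta + g_{\gamma\alpha,\delta}\,\dot{x}^\gamma\dot{x}^\delta + g_{\gamma\alpha}\,\ddot{x}^\gamma \\
&= \left(g_{\alpha\delta,\gamma} + g_{\gamma\alpha,\delta}\right)\dot{x}^\delta\dot{x}^\gamma + 2g_{\alpha\delta}\,\ddot{x}^\delta,
\end{aligned}
$$
because we are at liberty to rename dummy variables (ones which are summed over), and to
exchange indices of the metric tensor, since it is symmetric. The Lagrange equation (written with
the overall sign flipped, which is immaterial because the right-hand side is zero) therefore
reads:
$$
\begin{aligned}
\frac{d}{d\lambda}\frac{\partial L}{\partial\dot{x}^\alpha} - \frac{\partial L}{\partial x^\alpha}
&= \left(g_{\alpha\delta,\gamma} + g_{\gamma\alpha,\delta}\right)\dot{x}^\delta\dot{x}^\gamma + 2g_{\alpha\delta}\,\ddot{x}^\delta - g_{\gamma\delta,\alpha}\,\dot{x}^\gamma\dot{x}^\delta \\
&= \left(g_{\alpha\delta,\gamma} + g_{\gamma\alpha,\delta} - g_{\gamma\delta,\alpha}\right)\dot{x}^\delta\dot{x}^\gamma + 2g_{\alpha\delta}\,\ddot{x}^\delta = 0.
\end{aligned}
$$
Now multiply both sides by $\frac{1}{2}g^{\nu\alpha}$ to isolate the second derivative term:
$$
\ddot{x}^\nu + \frac{1}{2}g^{\nu\alpha}\left(g_{\alpha\delta,\gamma} + g_{\gamma\alpha,\delta} - g_{\gamma\delta,\alpha}\right)\dot{x}^\delta\dot{x}^\gamma = 0.
$$
But, by definition
$$
\Gamma^\nu_{\delta\gamma} = \frac{1}{2}g^{\nu\alpha}\left(g_{\alpha\delta,\gamma} + g_{\gamma\alpha,\delta} - g_{\gamma\delta,\alpha}\right),
$$
so, we finally have
$$
\ddot{x}^\nu = -\Gamma^\nu_{\delta\gamma}\,\dot{x}^\delta\dot{x}^\gamma,
$$
which is the geodesic equation we derived in class (eq. (31)). This proves that the Lagrangian in
eq. (552) also generates the geodesic equation (the factor of 1/2, or any other positive constant,
does not affect Lagrange’s equations).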
Example of Metric Conversion
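Let us see how to convert from one space metric to another, i.e., use eq. (4).
For example, given the space metric in Cartesian coordinates $(x^1, x^2, x^3) = (x, y, z)$
$$
\delta_{ij} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix},
$$
let us find the space metric in spherical coordinates $(x^{\prime 1}, x^{\prime 2}, x^{\prime 3}) = (r, \theta, \phi)$. Cartesian coordinates
are given in terms of spherical as:
$$
x = r\sin\theta\cos\phi, \qquad y = r\sin\theta\sin\phi, \qquad z = r\cos\theta,
$$
or
$$
x^1 = x^{\prime 1}\sin x^{\prime 2}\cos x^{\prime 3}, \qquad
x^2 = x^{\prime 1}\sin x^{\prime 2}\sin x^{\prime 3}, \qquad
x^3 = x^{\prime 1}\cos x^{\prime 2}.
$$
Then,
$$
\begin{aligned}
\frac{\partial x^1}{\partial x^{\prime 1}} &= \sin x^{\prime 2}\cos x^{\prime 3}, &
\frac{\partial x^1}{\partial x^{\prime 2}} &= x^{\prime 1}\cos x^{\prime 2}\cos x^{\prime 3}, &
\frac{\partial x^1}{\partial x^{\prime 3}} &= -x^{\prime 1}\sin x^{\prime 2}\sin x^{\prime 3}, \\
\frac{\partial x^2}{\partial x^{\prime 1}} &= \sin x^{\prime 2}\sin x^{\prime 3}, &
\frac{\partial x^2}{\partial x^{\prime 2}} &= x^{\prime 1}\cos x^{\prime 2}\sin x^{\prime 3}, &
\frac{\partial x^2}{\partial x^{\prime 3}} &= x^{\prime 1}\sin x^{\prime 2}\cos x^{\prime 3}, \\
\frac{\partial x^3}{\partial x^{\prime 1}} &= \cos x^{\prime 2}, &
\frac{\partial x^3}{\partial x^{\prime 2}} &= -x^{\prime 1}\sin x^{\prime 2}, &
\frac{\partial x^3}{\partial x^{\prime 3}} &= 0.
\end{aligned}
$$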
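From eq. (4), we have
$$
\begin{aligned}
ds^2 &= \delta_{ij}\,dx^i dx^j
= \delta_{ij}\frac{\partial x^i}{\partial x^{\prime k}}\frac{\partial x^j}{\partial x^{\prime l}}\,dx^{\prime k}dx^{\prime l} \\
&= (dx^{\prime 1})^2\left[\sin^2 x^{\prime 2}\cos^2 x^{\prime 3} + \sin^2 x^{\prime 2}\sin^2 x^{\prime 3} + \cos^2 x^{\prime 2}\right] \\
&\quad + (dx^{\prime 2})^2\left[(x^{\prime 1})^2\cos^2 x^{\prime 2}\cos^2 x^{\prime 3} + (x^{\prime 1})^2\cos^2 x^{\prime 2}\sin^2 x^{\prime 3} + (x^{\prime 1})^2\sin^2 x^{\prime 2}\right] \\
&\quad + (dx^{\prime 3})^2\left[(x^{\prime 1})^2\sin^2 x^{\prime 2}\sin^2 x^{\prime 3} + (x^{\prime 1})^2\sin^2 x^{\prime 2}\cos^2 x^{\prime 3}\right] \\
&= (dx^{\prime 1})^2 + (x^{\prime 1})^2(dx^{\prime 2})^2 + (x^{\prime 1})^2\sin^2 x^{\prime 2}\,(dx^{\prime 3})^2 \\
&= dr^2 + r^2 d\theta^2 + r^2\sin^2\theta\,d\phi^2 \\
&= p_{11}(dr)^2 + p_{22}(d\theta)^2 + p_{33}(d\phi)^2 \\
&= p_{11}(dx^{\prime 1})^2 + p_{22}(dx^{\prime 2})^2 + p_{33}(dx^{\prime 3})^2 = p_{ij}\,dx^{\prime i}dx^{\prime j},
\end{aligned}
$$
where the cross terms (k ≠ l) cancel. Reading off the diagonal components of the metric, we have
$$
p_{11} = 1, \qquad p_{22} = r^2, \qquad p_{33} = r^2\sin^2\theta,
$$
so, the space metric for spherical coordinates is
$$
p_{ij} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & r^2 & 0 \\ 0 & 0 & r^2\sin^2\theta \end{pmatrix}
\qquad\text{or}\qquad
p^{ij} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \dfrac{1}{r^2} & 0 \\ 0 & 0 & \dfrac{1}{r^2\sin^2\theta} \end{pmatrix}.
$$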
Deriving the geodesic equation in spherical coordinates. Let us now compute the geodesic
in 3D flat space, expressed in spherical coordinates. This should be an analog to geodesics in flat
space in Cartesian coordinates:
$$
\ddot{x}^\alpha = 0.
$$
This can be done in at least two ways.
Method 1: Brute force – computing Christoffel symbols and substituting them into the geodesic
equation. From eq. (14), Christoffel symbols for the spherical space are given by
$$
\Gamma^k_{ij} = \frac{1}{2}p^{kl}\left(p_{il,j} + p_{lj,i} - p_{ij,l}\right).
$$
Since $p_{11} = 1$, all of its derivatives vanish. Also, the Christoffel symbols are symmetric in their
lower indices, $\Gamma^k_{ij} = \Gamma^k_{ji}$ (look at the definition given
in eq. (14) and recall that the metric tensor is symmetric). Therefore, we have
$$
\Gamma^k_{1j} = \Gamma^k_{j1} = \frac{1}{2}p^{kl}\left(p_{1l,j} + p_{lj,1} - p_{1j,l}\right)
= \frac{1}{2}p^{kl}p_{lj,1} = \frac{1}{2}p^{k2}p_{2j,1} + \frac{1}{2}p^{k3}p_{3j,1},
$$
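and
$$
\begin{aligned}
\Gamma^1_{1j} &= \Gamma^1_{j1} = 0, \qquad \text{because } p^{12} = 0,\ p^{13} = 0, \\
\Gamma^1_{22} &= \frac{1}{2}p^{1l}\left(p_{2l,2} + p_{l2,2} - p_{22,l}\right) = -\frac{1}{2}p^{11}p_{22,1} = -r, \\
\Gamma^1_{23} &= \Gamma^1_{32} = \frac{1}{2}p^{1l}\left(p_{2l,3} + p_{l3,2} - p_{23,l}\right) = \frac{1}{2}p^{1l}p_{l3,2} = 0, \qquad \text{because } p_{23} = 0,\ p_{ij,3} = 0, \\
\Gamma^1_{33} &= \frac{1}{2}p^{1l}\left(p_{3l,3} + p_{l3,3} - p_{33,l}\right) = -\frac{1}{2}p^{11}p_{33,1} = -r\sin^2\theta, \\
\Gamma^2_{1j} &= \frac{1}{2}p^{22}p_{2j,1}, \\
\Gamma^2_{11} &= 0, \\
\Gamma^2_{12} &= \Gamma^2_{21} = \frac{1}{2}p^{22}p_{22,1} = \frac{1}{2}\frac{1}{r^2}\,2r = \frac{1}{r}, \\
\Gamma^2_{13} &= \Gamma^2_{31} = 0, \\
\Gamma^2_{22} &= 0, \\
\Gamma^2_{23} &= \Gamma^2_{32} = \frac{1}{2}p^{2l}\left(p_{2l,3} + p_{l3,2} - p_{23,l}\right) = \frac{1}{2}p^{22}\left(p_{22,3} + p_{23,2} - p_{23,2}\right) = 0, \\
\Gamma^2_{33} &= \frac{1}{2}p^{2l}\left(p_{3l,3} + p_{l3,3} - p_{33,l}\right) = -\frac{1}{2}p^{22}p_{33,2} = -\frac{1}{2r^2}\left(2r^2\sin\theta\cos\theta\right) = -\sin\theta\cos\theta, \\
\Gamma^3_{ij} &= \frac{1}{2}p^{3l}\left(p_{il,j} + p_{lj,i} - p_{ij,l}\right) = \frac{1}{2}p^{33}\left(p_{i3,j} + p_{3j,i} - p_{ij,3}\right) = \frac{1}{2}p^{33}\left(p_{i3,j} + p_{3j,i}\right), \\
\Gamma^3_{11} &= 0, \\
\Gamma^3_{12} &= \Gamma^3_{21} = 0, \\
\Gamma^3_{13} &= \Gamma^3_{31} = \frac{1}{2}p^{33}\left(p_{13,3} + p_{33,1}\right) = \frac{1}{2}p^{33}p_{33,1} = \frac{1}{2}\frac{1}{r^2\sin^2\theta}\left(2r\sin^2\theta\right) = \frac{1}{r}, \\
\Gamma^3_{22} &= 0, \\
\Gamma^3_{23} &= \Gamma^3_{32} = \frac{1}{2}p^{33}\left(p_{23,3} + p_{33,2}\right) = \frac{1}{2}p^{33}p_{33,2} = \frac{1}{2}\frac{1}{r^2\sin^2\theta}\left(2r^2\sin\theta\cos\theta\right) = \cot\theta, \\
\Gamma^3_{33} &= \frac{1}{2}p^{33}\left(p_{33,3} + p_{33,3}\right) = 0.
\end{aligned}
$$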
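The brute-force computation can be cross-checked numerically. The following sketch evaluates the definition of the Christoffel symbols with central finite differences of the metric; the function names, the step size and the sample point are illustrative assumptions, and the code relies on the metric being diagonal so that its inverse is simply 1/p_kk.

```python
import math

def metric(x):
    """Spherical metric p_ij = diag(1, r^2, r^2 sin^2(theta)), with x = (r, theta, phi)."""
    r, theta, _ = x
    return [[1.0, 0.0, 0.0],
            [0.0, r * r, 0.0],
            [0.0, 0.0, (r * math.sin(theta)) ** 2]]

def christoffel(x, h=1e-5):
    """gamma[k][i][j] = (1/2) p^{kk} (p_ik,j + p_kj,i - p_ij,k) for a diagonal metric."""
    p = metric(x)
    dp = []                                   # dp[m][i][j] = d p_ij / d x^m
    for m in range(3):
        xp, xm = list(x), list(x)
        xp[m] += h
        xm[m] -= h
        gp, gm = metric(xp), metric(xm)
        dp.append([[(gp[i][j] - gm[i][j]) / (2.0 * h) for j in range(3)]
                   for i in range(3)])
    gamma = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
    for k in range(3):
        inv_kk = 1.0 / p[k][k]                # inverse of a diagonal metric
        for i in range(3):
            for j in range(3):
                gamma[k][i][j] = 0.5 * inv_kk * (dp[j][i][k] + dp[i][k][j] - dp[k][i][j])
    return gamma

# Arbitrary sample point; indices are zero-based, so gamma[0][1][1] is Gamma^1_22
r, theta = 2.0, 0.7
g = christoffel([r, theta, 1.3])
print(g[0][1][1], -r)                          # Gamma^1_22 = -r
print(g[2][1][2], 1.0 / math.tan(theta))       # Gamma^3_23 = cot(theta)
```

Each printed pair should agree to the accuracy of the finite-difference step, and all symbols not listed as non-zero above should come out at rounding-error level.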
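The geodesic equation in spherical coordinates then becomes (recall $x^{\prime 1} = r$, $x^{\prime 2} = \theta$, $x^{\prime 3} = \phi$):
$$
\begin{aligned}
\ddot{x}^{\prime 1} = \ddot{r} &= -\Gamma^1_{\gamma\delta}\,\dot{x}^{\prime\gamma}\dot{x}^{\prime\delta} = -\Gamma^1_{22}(\dot{x}^{\prime 2})^2 - \Gamma^1_{33}(\dot{x}^{\prime 3})^2 \\
&= r\dot{\theta}^2 + r\sin^2\theta\,\dot{\phi}^2, \\
\ddot{x}^{\prime 2} = \ddot{\theta} &= -\Gamma^2_{\gamma\delta}\,\dot{x}^{\prime\gamma}\dot{x}^{\prime\delta} = -2\Gamma^2_{12}\,\dot{x}^{\prime 1}\dot{x}^{\prime 2} - \Gamma^2_{33}(\dot{x}^{\prime 3})^2 \\
&= -\frac{2}{r}\dot{r}\dot{\theta} + \sin\theta\cos\theta\,\dot{\phi}^2, \\
\ddot{x}^{\prime 3} = \ddot{\phi} &= -\Gamma^3_{\gamma\delta}\,\dot{x}^{\prime\gamma}\dot{x}^{\prime\delta} = -2\Gamma^3_{13}\,\dot{x}^{\prime 1}\dot{x}^{\prime 3} - 2\Gamma^3_{23}\,\dot{x}^{\prime 2}\dot{x}^{\prime 3} \\
&= -\frac{2}{r}\dot{r}\dot{\phi} - 2\cot\theta\,\dot{\theta}\dot{\phi}.
\end{aligned}
$$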
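Method 2: Using a Lagrangian $L = g_{\gamma\delta}\,\dot{x}^\gamma\dot{x}^\delta$. The alternative Lagrangian mentioned earlier
becomes
$$
L = p_{ij}\,\dot{x}^i\dot{x}^j = \dot{r}^2 + r^2\dot{\theta}^2 + r^2\sin^2\theta\,\dot{\phi}^2,
$$
so applying the Lagrange equations
$$
\frac{\partial L}{\partial x^l} - \frac{d}{d\lambda}\frac{\partial L}{\partial\dot{x}^l} = 0,
$$
yields, for each coordinate r, θ, φ:
$$
\begin{aligned}
\frac{\partial L}{\partial r} - \frac{d}{d\lambda}\frac{\partial L}{\partial\dot{r}}
&= 2r\dot{\theta}^2 + 2r\sin^2\theta\,\dot{\phi}^2 - 2\ddot{r} = 0
&&\Longrightarrow\quad \ddot{r} = r\dot{\theta}^2 + r\sin^2\theta\,\dot{\phi}^2, \\
\frac{\partial L}{\partial\theta} - \frac{d}{d\lambda}\frac{\partial L}{\partial\dot{\theta}}
&= 2r^2\sin\theta\cos\theta\,\dot{\phi}^2 - 4r\dot{r}\dot{\theta} - 2r^2\ddot{\theta} = 0
&&\Longrightarrow\quad \ddot{\theta} = -\frac{2}{r}\dot{r}\dot{\theta} + \sin\theta\cos\theta\,\dot{\phi}^2, \\
\frac{\partial L}{\partial\phi} - \frac{d}{d\lambda}\frac{\partial L}{\partial\dot{\phi}}
&= -4r\dot{r}\sin^2\theta\,\dot{\phi} - 4r^2\sin\theta\cos\theta\,\dot{\theta}\dot{\phi} - 2r^2\sin^2\theta\,\ddot{\phi} = 0
&&\Longrightarrow\quad \ddot{\phi} = -\frac{2}{r}\dot{r}\dot{\phi} - 2\cot\theta\,\dot{\theta}\dot{\phi}.
\end{aligned}
$$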
This set of equations represents motion in flat space, as described by spherical coordinates, and
therefore should describe straight lines. This is fairly easy to see for purely radial motion in the
x − y plane, θ = π/2 and φ = const., so the RHS of all three geodesic equations above vanish, and
we recover a straight (radial) line r̈ = 0. In a more general case, it is less trivial to show that the
equations above represent straight lines.
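For the general case, a numerical check is straightforward. The sketch below integrates the three geodesic equations with a classical fourth-order Runge-Kutta scheme and compares the end point with the straight line x0 + v λ in Cartesian coordinates. The initial position and velocity, the step count and all function names are arbitrary illustrative assumptions; the path is chosen to stay away from the origin and from the polar axis, where the spherical coordinates are singular.

```python
import math

def rhs(s):
    """Geodesic equations in spherical coordinates; s = (r, th, ph, dr, dth, dph)."""
    r, th, ph, dr, dth, dph = s
    ddr = r * dth ** 2 + r * math.sin(th) ** 2 * dph ** 2
    ddth = -2.0 / r * dr * dth + math.sin(th) * math.cos(th) * dph ** 2
    ddph = -2.0 / r * dr * dph - 2.0 * math.cos(th) / math.sin(th) * dth * dph
    return (dr, dth, dph, ddr, ddth, ddph)

def rk4_step(s, h):
    """One classical Runge-Kutta step of size h in the parameter lambda."""
    def shifted(base, slope, factor):
        return tuple(b + factor * k for b, k in zip(base, slope))
    k1 = rhs(s)
    k2 = rhs(shifted(s, k1, h / 2.0))
    k3 = rhs(shifted(s, k2, h / 2.0))
    k4 = rhs(shifted(s, k3, h))
    return tuple(si + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for si, a, b, c, d in zip(s, k1, k2, k3, k4))

def to_spherical(x, v):
    """Cartesian position and velocity -> (r, th, ph, dr, dth, dph)."""
    r = math.sqrt(sum(c * c for c in x))
    rho2 = x[0] ** 2 + x[1] ** 2
    dr = sum(a * b for a, b in zip(x, v)) / r
    dth = (x[2] * dr - r * v[2]) / (r * math.sqrt(rho2))
    dph = (x[0] * v[1] - x[1] * v[0]) / rho2
    return (r, math.acos(x[2] / r), math.atan2(x[1], x[0]), dr, dth, dph)

def to_cartesian(s):
    r, th, ph = s[:3]
    return (r * math.sin(th) * math.cos(ph),
            r * math.sin(th) * math.sin(ph),
            r * math.cos(th))

# Arbitrary initial conditions (illustrative assumptions)
x0, v0 = (1.0, 0.5, 0.3), (0.2, 0.4, 0.1)
state = to_spherical(x0, v0)
n_steps, lam_end = 1000, 1.0
for _ in range(n_steps):
    state = rk4_step(state, lam_end / n_steps)

x_end = to_cartesian(state)
x_line = tuple(a + lam_end * b for a, b in zip(x0, v0))
print(x_end, x_line)
```

The integrated end point should coincide with the point on the straight line to within the integration error, confirming that the spherical-coordinate geodesics are straight lines in disguise.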
As mentioned in class, using this alternative Lagrangian allows one to readily read off Christoffel
symbols. From the equations above, they are readily identified as
$$
\begin{aligned}
\Gamma^1_{22} &= -r, & \Gamma^1_{33} &= -r\sin^2\theta, \\
\Gamma^2_{12} &= \Gamma^2_{21} = \frac{1}{r}, & \Gamma^2_{33} &= -\sin\theta\cos\theta, \\
\Gamma^3_{13} &= \Gamma^3_{31} = \frac{1}{r}, & \Gamma^3_{23} &= \Gamma^3_{32} = \cot\theta,
\end{aligned}
$$
just as we computed by brute force. The factor 2 in front of Christoffel symbols $\Gamma^i_{jk}$ which have
unequal lower indices (j ≠ k) reflects the fact that because of symmetry both $\Gamma^i_{jk}$ and $\Gamma^i_{kj}$ are
counted.
It is not advisable to compute the geodesic equation from the traditional Lagrangian
$L = \sqrt{g_{\gamma\delta}\,\dot{x}^\gamma\dot{x}^\delta}$, as it will quickly lead to some extremely cumbersome algebra. The three Lagrange’s
equations should eventually reduce to the geodesic equations we derived above (because the two are
equivalent in terms of producing the same result) but it quickly becomes obvious which approach
is preferable.
Applying the Geodesic Equation
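Let us compute the geodesic equation on the surface of a sphere in 3D space. The radius is then
constant, r = R, the coordinates are $(x^1, x^2) = (\theta, \phi)$, and the metric is
$$
p_{ij} = \begin{pmatrix} R^2 & 0 \\ 0 & R^2\sin^2\theta \end{pmatrix}.
$$
The Lagrangian again is
$$
L = p_{ij}\,\dot{x}^i\dot{x}^j = R^2\dot{\theta}^2 + R^2\sin^2\theta\,\dot{\phi}^2,
$$
where i, j = 1, 2. Applying the Lagrange equations
$$
\frac{\partial L}{\partial x^l} - \frac{d}{d\lambda}\frac{\partial L}{\partial\dot{x}^l} = 0,
$$
yields, for each coordinate θ and φ:
$$
\begin{aligned}
\frac{\partial L}{\partial\theta} - \frac{d}{d\lambda}\frac{\partial L}{\partial\dot{\theta}}
&= 2R^2\sin\theta\cos\theta\,\dot{\phi}^2 - 2R^2\ddot{\theta} = 0
&&\Longrightarrow\quad \ddot{\theta} = \sin\theta\cos\theta\,\dot{\phi}^2, \\
\frac{\partial L}{\partial\phi} - \frac{d}{d\lambda}\frac{\partial L}{\partial\dot{\phi}}
&= -4R^2\sin\theta\cos\theta\,\dot{\theta}\dot{\phi} - 2R^2\sin^2\theta\,\ddot{\phi} = 0
&&\Longrightarrow\quad \ddot{\phi} = -2\cot\theta\,\dot{\theta}\dot{\phi}.
\end{aligned}
$$
The second equation reduces to
$$
\begin{aligned}
\ddot{\phi} + 2\cot\theta\,\dot{\theta}\dot{\phi} &= 0, \\
\ddot{\phi}\sin^2\theta + 2\sin\theta\cos\theta\,\dot{\theta}\dot{\phi} &= 0, \\
\frac{d}{d\lambda}\left(\dot{\phi}\sin^2\theta\right) &= 0,
\end{aligned}
$$
where the conserved term in parentheses is the angular momentum.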
We know that the geodesics on the surface of the sphere must be a part of a great circle – the
circle which contains the two points and whose radius is the radius of the sphere (its center also
coincides with the center of the sphere). We can check the two special cases, and make sure they
are correct:
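1. Equator: for two points along the equator the shortest distance will also be along the
equator. We need to show that such a curve, φ = c1 λ + φ0 and θ = π/2, satisfies the geodesic
equation. Plug
$$
\begin{aligned}
\phi &= c_1\lambda + \phi_0, & \dot{\phi} &= c_1, & \ddot{\phi} &= 0, \\
\theta &= \frac{\pi}{2}, & \dot{\theta} &= 0, & \ddot{\theta} &= 0,
\end{aligned}
$$
in the geodesic equation and obtain
$$
\begin{aligned}
\ddot{\theta} &= \sin\theta\cos\theta\,\dot{\phi}^2 = \sin\frac{\pi}{2}\cos\frac{\pi}{2}\,c_1^2 = 0, \\
\ddot{\phi} &= -2\cot\theta\,\dot{\theta}\dot{\phi} = -2\cot\frac{\pi}{2}\cdot 0\cdot c_1 = 0.
\end{aligned}
$$
So, the equator is a geodesic.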
2. Meridian: for the two points along the same meridian (arc of the great circle connecting the
two poles) the shortest distance should also be along the meridian. We need to show that
such a curve φ = φ0 , and θ = c2 λ + θ0 satisfies the geodesic equation. Plug
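$$
\begin{aligned}
\phi &= \phi_0, & \dot{\phi} &= 0, & \ddot{\phi} &= 0, \\
\theta &= c_2\lambda + \theta_0, & \dot{\theta} &= c_2, & \ddot{\theta} &= 0,
\end{aligned}
$$
in the geodesic equation and obtain
$$
\begin{aligned}
\ddot{\theta} &= \sin\theta\cos\theta\,\dot{\phi}^2 = \sin(c_2\lambda + \theta_0)\cos(c_2\lambda + \theta_0)\cdot 0^2 = 0, \\
\ddot{\phi} &= -2\cot\theta\,\dot{\theta}\dot{\phi} = -2\cot(c_2\lambda + \theta_0)\,c_2\cdot 0 = 0.
\end{aligned}
$$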
So, the meridian is a geodesic.
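Beyond these two special cases, a generic geodesic can be checked numerically. The sketch below integrates the two geodesic equations on the sphere with a fourth-order Runge-Kutta scheme and verifies that (a) the angular momentum φ̇ sin²θ is conserved and (b) the path stays in the plane through the centre of the sphere spanned by the initial position and velocity, i.e., on a great circle. The initial conditions, step count and function names are arbitrary assumptions chosen so that the path stays away from the poles.

```python
import math

def rhs(s):
    """Geodesic equations on the sphere; s = (theta, phi, dtheta, dphi)."""
    th, ph, dth, dph = s
    return (dth, dph,
            math.sin(th) * math.cos(th) * dph ** 2,
            -2.0 * math.cos(th) / math.sin(th) * dth * dph)

def integrate(s, lam_end, n_steps=2000):
    """Classical fourth-order Runge-Kutta in the affine parameter lambda."""
    h = lam_end / n_steps
    for _ in range(n_steps):
        k1 = rhs(s)
        k2 = rhs(tuple(a + h / 2.0 * b for a, b in zip(s, k1)))
        k3 = rhs(tuple(a + h / 2.0 * b for a, b in zip(s, k2)))
        k4 = rhs(tuple(a + h * b for a, b in zip(s, k3)))
        s = tuple(a + h / 6.0 * (p + 2.0 * q + 2.0 * u + w)
                  for a, p, q, u, w in zip(s, k1, k2, k3, k4))
    return s

def unit_vector(th, ph):
    """Point on the unit sphere in Cartesian coordinates."""
    return (math.sin(th) * math.cos(ph), math.sin(th) * math.sin(ph), math.cos(th))

# Arbitrary initial conditions: theta, phi, dtheta/dlambda, dphi/dlambda
start = (1.0, 0.2, 0.3, 0.5)
end = integrate(start, 2.0)

# Normal of the plane through the centre containing the initial position and velocity
th, ph, dth, dph = start
x0 = unit_vector(th, ph)
v0 = (math.cos(th) * math.cos(ph) * dth - math.sin(th) * math.sin(ph) * dph,
      math.cos(th) * math.sin(ph) * dth + math.sin(th) * math.cos(ph) * dph,
      -math.sin(th) * dth)
normal = (x0[1] * v0[2] - x0[2] * v0[1],
          x0[2] * v0[0] - x0[0] * v0[2],
          x0[0] * v0[1] - x0[1] * v0[0])

out_of_plane = sum(a * b for a, b in zip(normal, unit_vector(end[0], end[1])))
L_start = start[3] * math.sin(start[0]) ** 2
L_end = end[3] * math.sin(end[0]) ** 2
print(out_of_plane, L_start, L_end)
```

The out-of-plane component should vanish and the two angular momentum values should agree, both to within the integration error.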
Appendix to Lecture 6
Matter–Dark Energy Equality
In class, a question was raised of when the energy density of matter was equal to the “vacuum”
(dark) energy density.
This can be computed easily after recalling that
$$
\rho_{de} = \text{const.} = \rho_{de0}, \qquad
\rho_m a^3 = \text{const.} \quad\Rightarrow\quad \rho_m a^3 = \rho_{m0}a_0^3 \quad\Rightarrow\quad \rho_m = \rho_{m0}a^{-3},
$$
after noting that, by convention, $a_0 = 1$. So, the two energy densities are equal at $a_{\rm eq2}$ when
$$
1 = \frac{\rho_{de}}{\rho_m} = \frac{\rho_{de0}}{\rho_{m0}\,a_{\rm eq2}^{-3}}
\quad\Rightarrow\quad
a_{\rm eq2} = \left(\frac{\rho_{m0}}{\rho_{de0}}\right)^{1/3} = \left(\frac{0.28}{0.72}\right)^{1/3} = 0.73.
$$
So, the energy density of matter and the energy density of dark energy were equal when the Universe
was 0.73 (almost 3/4) of its size today.
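To compute how long ago this took place, we can compute the age of the Universe at $a_{\rm eq2}$ from
eq. (156)
$$
\begin{aligned}
H_0t_0 &= \int_0^1\frac{da}{\sqrt{\dfrac{1-\Omega_{de0}}{a} + \Omega_{de0}a^2}}
= \int_0^1\frac{a^{1/2}\,da}{\sqrt{(1-\Omega_{de0}) + \Omega_{de0}a^3}} \\
&= \frac{2}{3\sqrt{\Omega_{de0}}}\left[\ln\left(2\left(\sqrt{\Omega_{de0}a^3} + \sqrt{\Omega_{de0}(a^3-1)+1}\right)\right)\right]_0^1,
\end{aligned}
$$
by changing the upper limits of integration from $t_0$ and $a(t_0) = 1$ to $t_1$ and $a(t_1) \equiv a_{\rm eq2}$:
$$
\begin{aligned}
H_0t_1 &= \int_0^{a_{\rm eq2}}\frac{da}{\sqrt{\dfrac{1-\Omega_{de0}}{a} + \Omega_{de0}a^2}}
= \int_0^{a_{\rm eq2}}\frac{a^{1/2}\,da}{\sqrt{(1-\Omega_{de0}) + \Omega_{de0}a^3}} \\
&= \frac{2}{3\sqrt{\Omega_{de0}}}\left[\ln\left(2\left(\sqrt{\Omega_{de0}a^3} + \sqrt{\Omega_{de0}(a^3-1)+1}\right)\right)\right]_0^{a_{\rm eq2}} \\
&= \frac{2}{3\sqrt{\Omega_{de0}}}\ln\left[\frac{\sqrt{\Omega_{de0}a_{\rm eq2}^3} + \sqrt{\Omega_{de0}\left(a_{\rm eq2}^3-1\right)+1}}{\sqrt{1-\Omega_{de0}}}\right].
\end{aligned}
$$
So, for the observed parameters of $\Omega_{de0} = 0.72$ and the computed value of $a_{\rm eq2} = 0.73$, we obtain
$$
t_1 = \frac{2}{3H_0\sqrt{\Omega_{de0}}}\ln\left[\frac{\sqrt{0.72\cdot 0.73^3} + \sqrt{0.72\left(0.73^3-1\right)+1}}{\sqrt{1-0.72}}\right]
= \frac{2}{3H_0\sqrt{\Omega_{de0}}}\,(0.881).
$$
We compare this to the age of the Universe computed earlier in eq. (157)
$$
t_0 = \frac{2}{3H_0\sqrt{\Omega_{de0}}}\ln\left[\frac{1+\sqrt{\Omega_{de0}}}{\sqrt{1-\Omega_{de0}}}\right]
= \frac{2}{3H_0\sqrt{\Omega_{de0}}}\,(1.25) = 13.7\,{\rm A},
$$
to finally obtain
$$
\frac{t_1}{t_0} = \frac{0.881}{1.25}
\quad\Rightarrow\quad
t_1 = \frac{0.881}{1.25}\,t_0 = 0.70\,t_0 = 9.65\,{\rm A}.
$$
So, the Universe was 9.65 billion years old when energy densities of matter and dark energy were
equal. That was 13.7 − 9.65 = 4.05 billion years ago.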
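The arithmetic above is easy to reproduce. The sketch below evaluates the closed-form age integral for a flat universe containing only matter and dark energy; the function name `H0_t` is hypothetical, and the present age of 13.7 billion years is taken from the text rather than computed from a value of H0.

```python
import math

def H0_t(a, omega_de):
    """H0 * t(a) for a flat universe with matter and dark energy only (closed form)."""
    numerator = (math.sqrt(omega_de * a ** 3)
                 + math.sqrt(omega_de * (a ** 3 - 1.0) + 1.0))
    return (2.0 / (3.0 * math.sqrt(omega_de))
            * math.log(numerator / math.sqrt(1.0 - omega_de)))

omega_de, omega_m = 0.72, 0.28
a_eq2 = (omega_m / omega_de) ** (1.0 / 3.0)      # scale factor at equality

log_term_eq2 = H0_t(a_eq2, omega_de) * 3.0 * math.sqrt(omega_de) / 2.0
log_term_now = H0_t(1.0, omega_de) * 3.0 * math.sqrt(omega_de) / 2.0
ratio = log_term_eq2 / log_term_now              # t1 / t0

t0 = 13.7                                        # age today in billions of years (from the text)
t1 = ratio * t0
print(f"a_eq2 = {a_eq2:.2f}")
print(f"log terms: {log_term_eq2:.3f} and {log_term_now:.3f}")
print(f"t1 = {t1:.2f} billion years, {t0 - t1:.2f} billion years ago")
```

Using the unrounded value of a_eq2 instead of 0.73 changes the logarithmic term only in the fourth decimal place, so the quoted values are robust to that rounding.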
Figure 48: Three epochs in the evolution of the Universe: (1) radiation-dominated a < aeq , (2) matter-dominated aeq < a < aeq2 , (3) dark energy-dominated a > aeq2 . The plot shows log10[ρ(t)/ρcr] versus
the scale factor a(t) for radiation, matter and dark energy (Λ), with aeq , aeq2 and today marked. For the preview of what processes are
occurring in each of these epochs, see Fig. 1.15 in the textbook.
Appendix to Lecture 9
Radiation–Dark Energy Equality
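In class, a question was raised of when the energy density of radiation was equal to the “vacuum”
(dark) energy density.
The total energy density of radiation is the sum of the energy density of CMB photons, given
in eq. (209), and the energy density of neutrinos, given in eq. (224), while the energy density of
dark energy is $\Omega_{de} = \Omega_{de0} = \text{const.}$ We then have:
$$
\begin{aligned}
1 &= \frac{\Omega_r}{\Omega_{de0}} = \frac{\Omega_\gamma + \Omega_\nu}{\Omega_{de0}}
= \frac{\dfrac{2.47\times10^{-5}}{h^2a^4} + \dfrac{1.65\times10^{-5}}{h^2a^4}}{\Omega_{de0}}
= \frac{4.12\times10^{-5}}{\Omega_{de0}h^2a^4} \\
&\Rightarrow\quad a_{\rm eq3} = \left(\frac{4.12\times10^{-5}}{0.72\cdot 0.72^2}\right)^{1/4} \quad (\approx 0.1) \\
&\Rightarrow\quad 1 + z_{\rm eq3} = \left(\frac{4.12\times10^{-5}}{0.72\cdot 0.72^2}\right)^{-1/4} \quad (\approx 10),
\end{aligned}
$$
where the numbers in parentheses are given for $\Omega_{de0} = 0.72$ and h = 0.72.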
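From Friedmann’s first equation:
$$
\left(\frac{\dot{a}}{a}\right)^2 = H_0^2\left[\Omega_{m0}a^{-3} + \Omega_{r0}a^{-4} + \Omega_{de0}\right].
$$
Solving for $\dot{a}$, this becomes
$$
\dot{a} = H_0\sqrt{\frac{\Omega_{m0}}{a} + \frac{\Omega_{r0}}{a^2} + \Omega_{de0}a^2},
$$
and
$$
H_0t_{\rm eq3} = \int_0^{a_{\rm eq3}}\frac{da}{\sqrt{\dfrac{\Omega_{m0}}{a} + \dfrac{\Omega_{r0}}{a^2} + \Omega_{de0}a^2}}
= \int_0^{a_{\rm eq3}}\frac{a\,da}{\sqrt{\Omega_{m0}a + \Omega_{r0} + \Omega_{de0}a^4}},
$$
so the age of the Universe for $\Omega_{de0} = 0.72$, $\Omega_{m0} = 0.28$, and $\Omega_{r0} = 4.12\times10^{-5}/h^2 = 7.9\times10^{-5}$ at
$a_{\rm eq3}$ is (after using Maple to perform the calculation):
$$
t_{\rm eq3} \approx 0.54\,{\rm A} \approx 5.4\times10^{8}\ \text{years} = 540\ \text{million years}.
$$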
Matter-Radiation Equality
It is beneficial to compute at which point the energy densities of matter and radiation were
equal, because that was the point of transition between these two different regimes. This point is
called matter-radiation equality. The significance of this transition is that the perturbations in the
two regimes grow at different rates, as we will see later.
We find the value of the scale factor a(t) = aeq at which the energy densities of matter and
radiation were equal by setting their ratio to unity and solving for a.
The total energy density of radiation is the sum of the energy density of CMB photons, given
in eq. (209), and the energy density of neutrinos, given in eq. (224), while the energy density of
baryons is given in eq. (232). We then have:
$$
\begin{aligned}
1 &= \frac{\Omega_\gamma + \Omega_\nu}{\Omega_m}
= \frac{\dfrac{2.47\times10^{-5}}{h^2a^4} + \dfrac{1.65\times10^{-5}}{h^2a^4}}{\Omega_{m0}a^{-3}}
= \frac{4.12\times10^{-5}}{\Omega_{m0}h^2a} \\
&\Rightarrow\quad a_{\rm eq} = \frac{4.12\times10^{-5}}{\Omega_{m0}h^2} = 2.84\times10^{-4} \\
&\Rightarrow\quad 1 + z_{\rm eq} = 2.43\times10^{4}\,\Omega_{m0}h^2 = 3.52\times10^{3},
\end{aligned}
$$
where the numerical values are given for $\Omega_{m0} = 0.28$ and h = 0.72. We will see later that the
photons decouple from matter around z ≈ 10³, after the matter-radiation equality, which means
that the decoupling takes place in a matter-dominated Universe.
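Let us now estimate how old the Universe was when this happened. From Friedmann’s first
equation:
$$
\left(\frac{\dot{a}}{a}\right)^2 = H_0^2\left[\Omega_{m0}a^{-3} + \Omega_{r0}a^{-4} + \Omega_{de0}\right].
$$
Solving for $\dot{a}$, this becomes
$$
\dot{a} = H_0\sqrt{\frac{\Omega_{m0}}{a} + \frac{\Omega_{r0}}{a^2} + \Omega_{de0}a^2},
$$
and
$$
H_0t_{\rm eq} = \int_0^{a_{\rm eq}}\frac{da}{\sqrt{\dfrac{\Omega_{m0}}{a} + \dfrac{\Omega_{r0}}{a^2} + \Omega_{de0}a^2}}
= \int_0^{a_{\rm eq}}\frac{a\,da}{\sqrt{\Omega_{m0}a + \Omega_{r0} + \Omega_{de0}a^4}},
$$
so the age of the Universe for $\Omega_{de0} = 0.72$, $\Omega_{m0} = 0.28$, and $\Omega_{r0} = 4.12\times10^{-5}/h^2 = 7.9\times10^{-5}$
at $a_{\rm eq}$ is (after using Maple to perform the calculation):
$$
t_{\rm eq} \approx 4.8\times10^{-5}\,{\rm A} = 4.8\times10^{4}\ \text{years} \approx 50000\ \text{years}.
$$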
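The Maple calculation can be reproduced with a few lines of numerical quadrature. This is a sketch under stated assumptions: the Simpson-rule helper and variable names are hypothetical, and the conversion 1/H0 = 9.778/h billion years is a standard value that is not quoted in the text.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    width = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * width)
    return total * width / 3.0

h_hubble = 0.72
omega_m, omega_de = 0.28, 0.72
omega_r = 4.12e-5 / h_hubble ** 2

a_eq = omega_r / omega_m                 # scale factor at matter-radiation equality
one_plus_z_eq = 1.0 / a_eq

def integrand(a):
    """Integrand of H0 * t = int a da / sqrt(Omega_m a + Omega_r + Omega_de a^4)."""
    return a / math.sqrt(omega_m * a + omega_r + omega_de * a ** 4)

hubble_time_years = 9.778e9 / h_hubble   # 1/H0 in years (assumed conversion)
t_eq_years = simpson(integrand, 0.0, a_eq) * hubble_time_years
print(f"a_eq = {a_eq:.3e}, 1 + z_eq = {one_plus_z_eq:.0f}, t_eq = {t_eq_years:.2e} years")
```

At such early times the dark energy term in the integrand is negligible, so the result is set almost entirely by the matter and radiation densities.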
Figure 49: Three epochs in the evolution of the Universe: (1) radiation-dominated a < aeq , (2) matter-dominated aeq < a < aeq2 , (3) dark energy-dominated a > aeq2 . The plot shows log10 Ω(t) versus
the scale factor a(t) for radiation, matter and dark energy (Λ), with the transitions marked at
t = 5 × 10^4 years, t = 5.4 × 10^8 years and t = 9.65 × 10^9 years, and today at t = 13.7 × 10^9 years.
Chemical Potential
The distribution function for both fermions and bosons is given by
$$
f = \frac{1}{e^{(E-\mu)/T} \pm 1},
$$
(+ for fermions and − for bosons). For a thermal background radiation, the chemical potential
µ is always zero. The reason is the following: µ is defined in the context of the first law of
thermodynamics as the change in energy associated with the change in particle number
$$
dE = T\,dS - P\,dV + \mu\,dN.
$$
As N adjusts to its equilibrium value, we expect that the system will be stationary with respect
to small changes in N. More rigorously, the Helmholtz free energy F = E − TS is minimized
(dF/dN = 0) in equilibrium for a system at constant temperature (dT = 0) and volume (dV = 0).
Taking the differential of the Helmholtz free energy, we obtain
$$
dF = dE - T\,dS - S\,dT,
$$
which, combined with the first law of thermodynamics above, yields
$$
\begin{aligned}
dF &= T\,dS - P\,dV + \mu\,dN - T\,dS - S\,dT = -P\,dV - S\,dT + \mu\,dN \\
\Longrightarrow\quad \frac{dF}{dN} &= -P\frac{dV}{dN} - S\frac{dT}{dN} + \mu = \mu = 0.
\end{aligned}
$$