Massive neutrinos and cosmology

Master thesis by
Jostein Riiser Kristiansen
Institute of Theoretical Astrophysics
University of Oslo
Norway
May 2006
Acknowledgments
First of all I want to thank my supervisor, Øystein Elgarøy. Firstly, for introducing
me to such a new and dynamic field as neutrino cosmology. Secondly, for always
being positively minded to my questions and for never giving me the feeling of
being a burden. I am also grateful for the opportunity I got to go to the Erice
school of nuclear physics in Sicily in September 2005. Thank you.
On the way I have encountered numerous problems, especially related to the
numerical codes that I have been using. In this regard I want to thank David F. Mota
for being extremely helpful with problems related to modification of CMBEASY.
I also want to thank Hans Kristian Eriksen and Frode K. Hansen for the help
provided when I have encountered problems with the MPI implementation of CosmoMC, and Mateusz Røstad for teaching me about Markov chain Monte Carlo
methods. Thank you.
Two important sources of inspiration from the Department of Physics have
been Prof. Finn Ravndal and Prof. Øyvind Grøn. The lectures that they have given
and their friendly attitude to questions have contributed heavily to make theoretical
physics and cosmology an interesting field of study. Thank you.
Thanks to the people in and around Fysikkforeningen and Fysisk fagutvalg over
the last years for making the long days on campus a lot more joyful. Especially I
want to thank my fellow cosmologists Øystein and Gorm. Also thanks to Anders
for long discussions on everything from definite integrals to the meaning of life.
To Nicolaas and Nicolay for helping me out with C-programs. To Henning, Marte,
Josefine, Glenn and the other people in the study hall for making the days at Astro
happier. Thank you.
Finally I want to thank Johannes, Petter and my family for caring and for making me think about other things than physics. Thank you :-)
Contents
1 Introduction
2 Physics of the neutrino mass
2.1 Neutrino masses in electro-weak theory
2.1.1 Dirac vs Majorana masses
2.2 Neutrino oscillations
2.2.1 Experimental evidence and parameter bounds
2.2.2 Summary of neutrino oscillations
2.3 Neutrino mass schemes
2.4 Determination of absolute neutrino masses
2.4.1 Tritium beta decay
2.4.2 Neutrinoless double beta decay
2.4.3 Cosmology
2.5 How to give the neutrinos their masses
2.5.1 The seesaw mechanism for generating neutrino mass
2.5.2 Other ways to generate neutrino mass
2.5.3 Conclusions on mass generating mechanisms
3 Cosmology
3.1 Notation
3.2 Einstein’s field equations
3.2.1 Gµν and its constituents
3.2.2 The energy-momentum tensor Tµν
3.3 The Friedmann equations
3.4 The first 300 000 years or so
3.4.1 Inflation
3.4.2 Neutrinos in the early universe
3.4.3 Formation of CMB
3.5 Cosmological observables
3.5.1 CMB measurements
3.5.2 Large scale structure surveys
3.5.3 Some other cosmological observables
4 Cosmological perturbation theory
4.1 Introduction
4.2 The homogeneous and isotropic background
4.3 Perturbations to the FRW-metric
4.3.1 Decomposition of perturbations
4.4 Freedom of gauge choice
4.5 Particle distributions and the Boltzmann equations
4.5.1 The perturbation equations for massive neutrinos
4.6 The perturbed Einstein equations
4.6.1 The perturbed Einstein tensor
4.6.2 The perturbed energy-momentum tensor
4.6.3 Combining the equations
4.7 Solutions to the perturbation equations
4.8 Solutions in a pure ΛCDM model
4.8.1 Jeans scale and radiation domination
4.8.2 Matter domination
4.8.3 Λ domination
4.8.4 Summary
4.9 Massive neutrinos and structure formation
4.9.1 Neutrino free streaming
5 Cosmological neutrino mass limits
5.1 Massive neutrinos and CMB
5.1.1 Reduced CMB observables
5.1.2 Analytic considerations on the effect of massive neutrinos
5.1.3 Numerical results from CMB alone
5.2 Cosmology and neutrino mass hierarchies
5.3 Mass limits including various data sets
5.4 Dark energy with $w_X \neq -1$
5.5 The relation between the 0νββ result and cosmological mass limits
6 Summary and outlook
A Some comments on model dependency in cosmology
A.1 Model dependency and indirectness
A.2 Problems appearing in cosmology
A.2.1 On the border of becoming an exact science
A.2.2 Feedback when trying to verify a model
A.2.3 Self-maintenance of popular models
A.2.4 Selecting the right model
B Derivation of $dq/d\eta$
C MCMC and CosmoMC
C.1 The likelihood function
C.2 CosmoMC
Bibliography
Chapter 1
Introduction
I have been working on neutrino cosmology. “Neutrino cosmology” is in itself a
very peculiar expression. Neutrinos are without comparison the lightest massive
particles we know, while cosmology is the science of the very largest scales we
know. Just the idea that these tiny particles can leave observable imprints in the
evolution of the universe is fascinating, and even more the fact that our cosmological observations can constrain the absolute scale of the neutrino masses with
significantly better accuracy than current experiments in particle physics.
Neutrinos were first postulated by Wolfgang Pauli in 1931 to explain the apparent disappearance of energy in β-decay experiments. There is a famous quote
from Pauli saying
I have committed the cardinal sin of a theorist, I made a prediction
which can never be tested, ever, because this particle is so weakly
interacting that it may never be seen.
However, 25 years later, in 1956, the neutrino was detected for the first time by
Cowan and Reines in an inverse β-decay experiment (for which Reines received a share of the
1995 Nobel Prize). The mechanism of neutrino flavor oscillation as a method to
detect a possible neutrino mass was first suggested by Bruno Pontecorvo in 1957:
if neutrinos are massive particles, they will oscillate over to other flavor
states with a certain probability. The first detection of neutrino oscillations, and
thus the first evidence that neutrinos are massive particles, came as late as 1998, when the
Super-Kamiokande neutrino detector observed atmospheric neutrinos. The
Super-Kamiokande result has since been confirmed by several experiments.
The problem with the oscillation experiments is that they are only sensitive
to the differences between the squared masses of the neutrino mass eigenstates.
They do not teach us anything about the absolute mass scale. Other neutrino experiments, like tritium β-decay and neutrinoless double β-decay experiments, may
constrain the absolute scale of the neutrino mass, but the limits provided by such
experiments are still poor.
Cosmological observables are at leading order only sensitive to the absolute
scale of the sum of the neutrino masses. Therefore cosmology is an excellent tool
for exploring this largely unconstrained branch of neutrino physics. Due to the
dramatic improvement in recent years in observations of the cosmic background
radiation, large scale structures and supernovae, cosmology has turned into an exact
science with a relatively well established standard model. With the continuous
improvement of available cosmological data, the cosmological upper limits on the
sum of the neutrino masses have improved by almost an order of magnitude since
the first good cosmological upper limits were given in 2002. Now the upper limits
are only an order of magnitude larger than the lower limit inferred from oscillation
experiments, which makes a detection likely within a few years. However, these
cosmological limits carry with them lots of uncertainties, both when it comes to the
reliability of the data and the underlying cosmological model. It is therefore crucial
to have good knowledge of the robustness of the cosmological mass limits to the
use of different data sets and to changes in the underlying cosmological model.
These are issues that I will discuss in this thesis.
I will start by introducing some of the physics of massive neutrinos, including
flavor oscillations and mass generating mechanisms. Here I will also summarize
experimental constraints on different aspects of the neutrino mass. In chapter 3 I
will present the basics of the current cosmological standard model and some important observable quantities. In chapter 4 I focus on linear cosmological perturbation theory, and especially the relation between cosmological perturbations and
massive neutrinos. In chapter 5 I will present quantitative results on cosmological
neutrino mass limits. I discuss limits that I have obtained and how these correspond
to limits presented in the literature. Here I also study effects of using different data
sets and the effect of allowing for dark energy in another form than a cosmological
constant. At the end of the chapter I will examine the relation between the cosmological neutrino mass limits and the claimed detection of the effective electron
neutrino mass from the Heidelberg-Moscow experiment. In chapter 6 I conclude
with a short summary and an outlook.
Chapter 2
Physics of the neutrino mass
The main topic of this thesis is the relation between neutrino masses and cosmological observables. In this chapter I will give a summary of the theoretical background for massive neutrinos, and some of the experimental evidence we have for
such masses. In addition to the constraints provided by experiments, I will mention
a few of the most commonly cited models for generating neutrino mass.
2.1 Neutrino masses in electro-weak theory
In this section I will give a short summary of how neutrino masses appear in
quantum field theory, without going into details. For simplicity of notation I will
first assume a single neutrino species. When considering neutrino oscillations in
the next section I will generalize this notation to allow for more species. This
section is based on the references [1], [2], [3] and [4].
In quantum field theory we may represent the neutrino field by a four-component spinor
\[ \nu = \begin{pmatrix} \nu^L \\ \nu^R \end{pmatrix}, \tag{2.1} \]
where $\nu^L$ and $\nu^R$ are 2-spinors. The L and R denote left-handed and right-handed
helicity, respectively. One may write the $\nu^L$ and $\nu^R$ spinors as projections of the
full ν field using the projection operators
\[ \left. \begin{matrix} P^L \\ P^R \end{matrix} \right\} = \frac{1}{2}\left(1 \pm \gamma_5\right) \tag{2.2} \]
For a review of the properties of the $\gamma_5$ matrix, see [2]. Now $\nu^L$ and $\nu^R$ can be
written as
\[ \nu^L = P^L \nu, \qquad \nu^R = P^R \nu \tag{2.3} \]
One of the most useful properties of the $\gamma_5$ matrix is that $\gamma_5\gamma_5 = 1$. Using this,
one finds that
\[ [P^L]^2 = P^L, \qquad [P^R]^2 = P^R, \qquad P^R P^L = P^L P^R = 0, \qquad P^L = 1 - P^R. \tag{2.4} \]
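The defining properties of a pair of projection operators (each is idempotent, they are mutually orthogonal, and they add up to the identity) can be checked numerically. The following is only a minimal sketch: it assumes the Dirac representation of $\gamma_5$ (off-diagonal 2 × 2 identity blocks), which is one common choice and not necessarily the convention of [2], and the names `P_PLUS` and `P_MINUS` are hypothetical labels for the two sign choices in (2.2).

```python
# Sketch: check the projector algebra built from gamma_5 with plain Python lists.
# Assumption: gamma_5 in the Dirac representation (off-diagonal identity blocks).


def matmul(a, b):
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    """Add two matrices element by element."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def scale(c, a):
    """Multiply every matrix element by the number c."""
    return [[c * x for x in row] for row in a]


IDENTITY = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
ZERO = [[0.0] * 4 for _ in range(4)]
GAMMA5 = [[0.0, 0.0, 1.0, 0.0],
          [0.0, 0.0, 0.0, 1.0],
          [1.0, 0.0, 0.0, 0.0],
          [0.0, 1.0, 0.0, 0.0]]

# The two sign choices (1 + gamma_5)/2 and (1 - gamma_5)/2.
P_PLUS = scale(0.5, add(IDENTITY, GAMMA5))
P_MINUS = scale(0.5, add(IDENTITY, scale(-1.0, GAMMA5)))

if __name__ == "__main__":
    print("gamma_5 squared is the identity:", matmul(GAMMA5, GAMMA5) == IDENTITY)
    print("P_PLUS is idempotent:", matmul(P_PLUS, P_PLUS) == P_PLUS)
    print("P_MINUS is idempotent:", matmul(P_MINUS, P_MINUS) == P_MINUS)
    print("the projectors are orthogonal:", matmul(P_PLUS, P_MINUS) == ZERO)
    print("the projectors are complete:", add(P_PLUS, P_MINUS) == IDENTITY)
```

All matrix entries here are exact binary fractions, so the comparisons can be made with exact equality; for a general representation a small tolerance would be the safer choice.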
These properties justify the use of the term "projection" operators. In the case of a
vanishing neutrino mass the projection operators will be true helicity projection
operators, and the $\nu^L$ and $\nu^R$ fields will be totally independent of one another.
Allowing for non-zero neutrino masses, $P^L$ and $P^R$ will only be helicity projection
operators in the limit where the total energy is much larger than the neutrino mass
$m_\nu$. That is, non-zero neutrino masses imply a coupling between the $\nu^R$ and $\nu^L$
fields. This is easily seen by writing out the Dirac equations for the two fields [3]
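\[ \begin{aligned} \left( i\boldsymbol{\sigma}\cdot\nabla - i\frac{\partial}{\partial t} \right) \nu^L &= -m_\nu \nu^R \\ \left( -i\boldsymbol{\sigma}\cdot\nabla - i\frac{\partial}{\partial t} \right) \nu^R &= -m_\nu \nu^L \end{aligned} \tag{2.5} \]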
Here the mass on the right-hand side acts as a coupling between the $\nu^L$ and $\nu^R$ fields. Experiments show that only the left-handed neutrino field takes part in weak interactions
in nature. So when assuming massless neutrinos one may safely neglect the right-handed neutrinos when, for instance, counting degrees of freedom in a given model.
But as we see from (2.5), when the mass is non-zero, the right-handed neutrinos
will enter the model, and things will be a bit more complicated. Luckily, this effect
is very small, and for the relevant weak interaction rates the corrections will be of
order $(m_{\nu_l}/m_l)^2$ [2] (where l denotes a lepton flavor), which should be less than
$\sim 10^{-12}$. Also, when counting degrees of freedom in cosmology the corrections
will be extremely small, since the neutrinos decoupled from the baryon-photon
plasma while still being highly relativistic. The smallness of these corrections will,
in addition to facilitating calculations in cosmology, contribute to the difficulties in
finding good limits on the neutrino mass in accelerator experiments.
2.1.1 Dirac vs Majorana masses
The type of mass term that one usually sees in the Lagrangians of electro-weak
theory is of the form $m\bar{\psi}\psi$. This is called a Dirac mass term. For a charged
fermion like the electron this is the only possible form. However, for electrically
neutral fermions like the neutrinos there is another possibility, called a Majorana
mass term, which is of the form $m\psi^T C^{-1}\psi$. Here C is a charge conjugation matrix. This explains why this kind of mass term is impossible for electrons; it would
violate conservation of electric charge. The condition for being a Majorana particle
is that the four-spinor is self-charge conjugate, that is $\nu = C\bar{\nu}^T$. This means that
a Majorana particle is its own antiparticle. A problem with the Majorana theory is
conservation of lepton number. The neutrinos and their electrically charged counterparts (e, µ, τ) are given the same lepton number (and the antiparticles the opposite number), and the lepton number is often assumed to be a conserved quantity.
But of course, if one allows for Majorana neutrinos, the neutrino will be its own
antiparticle, and lepton number conservation would have to be violated in, for example, β-decay ($n \to p + e^- + \bar{\nu}_e$), since any assignment of a lepton number to a
Majorana neutrino would be meaningless because of its dual nature.
2.2 Neutrino oscillations
The only clear evidence for a non-zero neutrino mass that has been found so far, is
the existence of neutrino oscillations. By neutrino oscillations we mean that there
is a non-zero probability that a neutrino will change its flavor. As an example, an
electron neutrino produced in the sun may be observed as a νµ in an earth-based
detector. As will be shown in this section, such oscillations may only occur if
at least two of the neutrinos have different masses. Thus a detection of neutrino
oscillations shows us that at least one of the neutrinos has a non-zero mass. The
following discussion of neutrino oscillations is mainly based on the books [1], [4]
and [2], and the papers [5] and [6].
Still holding on to the one-flavor scenario, and assuming Dirac neutrinos, the
Lagrangian neutrino mass term will look like
\[ \mathcal{L}_{m_\nu} = -m_\nu \left( \bar{\nu}^L \nu^R + \bar{\nu}^R \nu^L \right) \tag{2.6} \]
The $\bar{\nu}^L \nu^L$ and $\bar{\nu}^R \nu^R$ terms vanish due to the properties of the projection operators
given in (2.4).
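Allowing for more neutrino flavors the mass term will look like
\[ \mathcal{L}_{m_\nu} = -\left( \bar{\nu}^L \cdot M \cdot \nu^R + \bar{\nu}^R \cdot M^T \cdot \nu^L \right) \tag{2.7} \]
where M is a 3 × 3 Hermitian mass matrix. The neutrino field ν is now given by
\[ \nu = \begin{pmatrix} \nu_e \\ \nu_\mu \\ \nu_\tau \end{pmatrix} \tag{2.8} \]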
If this mass matrix is diagonal, the mass eigenstates will be the same as the
flavor (or weak) eigenstates, which would be simple, nice and a bit boring. But
there is no principle telling us that this has to be the case, and we may allow for
a different set of mass eigenstates and flavor eigenstates. To get an idea of what
is going on, we now assume that only two of the neutrinos, say e and µ, will mix.
This can be written as
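\[ \begin{pmatrix} \nu_e \\ \nu_\mu \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} \nu_1 \\ \nu_2 \end{pmatrix} \tag{2.9} \]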
For a three-neutrino scenario we would need a 3 × 3 mixing matrix and two more
mixing angles.
So what is the difference between flavor and mass eigenstates? The flavor
eigenstates are the eigenstates that will take part in interactions like β-decay or the
fusion processes in the core of the sun. But when a neutrino is produced in such
an interaction, the mass eigenstates will determine how the neutrino propagates
in time until it for example reaches an earth based detector where, again, a flavor
eigenstate will be detected. We can see from the mixing matrix in (2.9) that in for
example the case θ = 0, the mixing matrix will be diagonal, and one specific mass
eigenstate will correspond to one specific flavor eigenstate, and we will observe
no oscillations. As another example, if θ = π/4 (perfect mixing) oscillations may
occur.
The time propagation of a neutrino state is given by
\[ \nu_i(t) = \nu_i(0)\, e^{-iE_i t/\hbar} \tag{2.10} \]
where the subscript i runs over the different mass eigenstates. So when a flavor
eigenstate neutrino is produced, it will propagate as a linear combination of the
different mass eigenstates. A pure νe beam will change to a superposition of a νe
and a νµ beam, become a pure νµ beam (in the case of perfect mixing), and then
oscillate back to a pure νe beam.
Already at this point it is clear that if we have two different sources of neutrino
beams with a known initial flavor at two different distances, and if these distances
are of the same order of magnitude as the oscillation length, it should in principle be possible to determine the different mixing angles and mass differences by
measuring the fraction of neutrinos that have changed their flavor when reaching
our detector. This is not a trivial task to do, especially since neutrinos are so hard
to detect in large quantities, but good attempts have been made, and some good
results have been obtained. I will come back to these results later in this chapter.
I will now assume that we start with a νe state, and derive the probability for
oscillation to a νµ state. From (2.9) and (2.10) we see that the state is given by
\[ \psi(t) = \nu_1(0) \cos\theta\, e^{-iE_1 t/\hbar} + \nu_2(0) \sin\theta\, e^{-iE_2 t/\hbar}. \tag{2.11} \]
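To find the oscillation probability, we first find the matrix element for oscillation
by projecting this state onto the νµ state,
\[ \begin{aligned} \langle \nu_\mu(0) | \psi(t) \rangle &= \left( -\sin\theta ,\ \cos\theta \right) \begin{pmatrix} \cos\theta\, e^{-iE_1 t/\hbar} \\ \sin\theta\, e^{-iE_2 t/\hbar} \end{pmatrix} \\ &= \sin\theta \cos\theta \left( -e^{-iE_1 t/\hbar} + e^{-iE_2 t/\hbar} \right). \end{aligned} \tag{2.12} \]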
The probability for oscillation is now given by
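\[ \begin{aligned} P(\nu_e \to \nu_\mu) &= |\langle \nu_\mu(0) | \psi(t) \rangle|^2 \\ &= \sin^2\theta \cos^2\theta \left[ \left( -e^{-iE_1 t/\hbar} + e^{-iE_2 t/\hbar} \right) \left( -e^{iE_1 t/\hbar} + e^{iE_2 t/\hbar} \right) \right] \\ &= 2 \sin^2\theta \cos^2\theta \left\{ 1 - \cos[(E_1 - E_2)t/\hbar] \right\} \\ &= \frac{1}{2} \sin^2(2\theta) \left\{ 1 - \cos[(E_1 - E_2)t/\hbar] \right\}. \end{aligned} \tag{2.13} \]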
This expression seems reasonable. We see that the largest amplitude is given for
perfect mixing (θ = π/4), and that the probability for oscillation vanishes as θ →
0.
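The step from the matrix element (2.12) to the closed form (2.13) can be cross-checked numerically. The sketch below is an illustration only: the function names are hypothetical, and the arguments `phase1` and `phase2` stand for the dimensionless combinations $E_1 t/\hbar$ and $E_2 t/\hbar$, so that no particular unit system has to be assumed.

```python
import cmath
import math


def amplitude_e_to_mu(theta, phase1, phase2):
    """Matrix element <nu_mu(0)|psi(t)> of (2.12).

    phase1 and phase2 stand for E_1 t / hbar and E_2 t / hbar.
    """
    return math.sin(theta) * math.cos(theta) * (
        -cmath.exp(-1j * phase1) + cmath.exp(-1j * phase2)
    )


def probability_e_to_mu(theta, phase1, phase2):
    """Closed form of the oscillation probability, last line of (2.13)."""
    return 0.5 * math.sin(2.0 * theta) ** 2 * (1.0 - math.cos(phase1 - phase2))


if __name__ == "__main__":
    # Compare |amplitude|^2 with the closed form for a few mixing angles,
    # with the phase difference fixed to pi (the point of maximal conversion).
    for theta in (0.0, math.pi / 8, math.pi / 4):
        direct = abs(amplitude_e_to_mu(theta, 0.3, 0.3 + math.pi)) ** 2
        closed = probability_e_to_mu(theta, 0.3, 0.3 + math.pi)
        print(f"theta = {theta:.4f}: |amplitude|^2 = {direct:.6f}, "
              f"closed form = {closed:.6f}")
```

With the phase difference set to π, the probability reaches unity for θ = π/4 and vanishes for θ = 0, in line with the two limits discussed above.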
As already mentioned, a very interesting quantity when it comes to determining the mixing angles is the typical oscillation length scale, which is just the
wavelength, L, from (2.13), given by
\[ L = \frac{2\pi c\hbar}{\Delta E} \tag{2.14} \]
where $\Delta E = E_1 - E_2$. Since almost all neutrinos detected can be expected to be
ultra-relativistic [4], the energy can be expanded to lowest order in the mass as
\[ E = \sqrt{p^2 c^2 + m^2 c^4} \approx pc + \frac{m^2 c^4}{2E}. \tag{2.15} \]
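Momentum is conserved during the oscillations, so $\Delta E$ is only
sensitive to $\Delta m^2 \equiv m_1^2 - m_2^2$, such that $\Delta E = \frac{\Delta m^2 c^4}{2E}$. Using this in (2.14), the
oscillation wavelength is given by
\[ L = \frac{4\pi c E \hbar}{\Delta m^2 c^4} = \frac{4\pi E \hbar}{c^3 \Delta m^2} \approx 2.48\,\mathrm{m}\; \frac{E/\mathrm{MeV}}{\Delta m^2/(\mathrm{eV})^2}. \tag{2.16} \]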
So of which order of magnitude is this L in a typical experiment? The typical
energy of a detected neutrino depends on the type of detector used, but is usually
of order 1 MeV. And, anticipating some results, $\Delta m^2$ is of order $\sim 10^{-3}$ eV² or
$\sim 10^{-5}$ eV². Using (2.16) one finds that the typical oscillation wavelength for
a detected neutrino is $\sim 10^3$ m or $\sim 10^5$ m, respectively, that is, much less than the distance
between the sun and the earth, but comparable to the height of the atmosphere of
the earth.
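The numerical prefactor in (2.16) and the resulting length scales are easy to reproduce. The following sketch assumes the value $\hbar c \approx 1.973 \times 10^{-7}$ eV m and takes the mass squared difference in units where $\Delta m^2 c^4$ is given in eV²; the function name and the example inputs are chosen for illustration only.

```python
import math

# hbar * c in eV m (197.327 MeV fm); assumed constant for this sketch.
HBAR_C_EV_M = 1.973269804e-7


def oscillation_length_m(energy_ev, delta_m2_ev2):
    """Oscillation wavelength L = 4 pi E hbar c / (dm^2 c^4), as in (2.16).

    energy_ev    : neutrino energy E in eV
    delta_m2_ev2 : mass squared difference (times c^4) in eV^2
    Returns the wavelength in metres.
    """
    return 4.0 * math.pi * HBAR_C_EV_M * energy_ev / delta_m2_ev2


if __name__ == "__main__":
    energy = 1.0e6  # 1 MeV
    for dm2 in (1.0, 1.0e-3, 1.0e-5):
        length = oscillation_length_m(energy, dm2)
        print(f"E = 1 MeV, dm^2 = {dm2:.0e} eV^2 -> L = {length:.3g} m")
```

For E = 1 MeV and $\Delta m^2$ = 1 eV² this gives the prefactor of about 2.48 m; since L scales as $E/\Delta m^2$, a GeV-scale neutrino has a wavelength a thousand times longer for the same $\Delta m^2$.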
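In a 3-neutrino scenario the mixing matrix is often parametrized like
\[ U = \begin{pmatrix} c_{13} c_{12} & c_{13} s_{12} & s_{13} e^{-i\delta} \\ -c_{23} s_{12} - s_{13} s_{23} c_{12} e^{i\delta} & c_{23} c_{12} - s_{13} s_{23} s_{12} e^{i\delta} & c_{13} s_{23} \\ s_{23} s_{12} - s_{13} c_{23} c_{12} e^{i\delta} & -s_{23} c_{12} - s_{13} c_{23} s_{12} e^{i\delta} & c_{13} c_{23} \end{pmatrix} \tag{2.17} \]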
Here I have used a notation where $s_{ij} = \sin\theta_{ij}$ and $c_{ij} = \cos\theta_{ij}$. The δ is a CP-violating phase¹ which is of great theoretical interest. Notice that
every term containing this δ is proportional to $s_{13}$. This means that our ability to
observe this δ requires that $\theta_{13}$ is not too small.
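Two properties of the parametrization (2.17) can be verified numerically: the matrix is unitary for any choice of angles and phase, and the phase δ drops out completely when $\theta_{13} = 0$. The sketch below uses hypothetical function names and arbitrary test angles; it is not tied to any fitted values.

```python
import cmath
import math


def mixing_matrix(theta12, theta23, theta13, delta):
    """Three-flavor mixing matrix in the parametrization of (2.17)."""
    s12, c12 = math.sin(theta12), math.cos(theta12)
    s23, c23 = math.sin(theta23), math.cos(theta23)
    s13, c13 = math.sin(theta13), math.cos(theta13)
    phase = cmath.exp(1j * delta)  # e^{i delta}
    return [
        [c13 * c12, c13 * s12, s13 * phase.conjugate()],
        [-c23 * s12 - s13 * s23 * c12 * phase,
         c23 * c12 - s13 * s23 * s12 * phase,
         c13 * s23],
        [s23 * s12 - s13 * c23 * c12 * phase,
         -s23 * c12 - s13 * c23 * s12 * phase,
         c13 * c23],
    ]


def unitarity_defect(u):
    """Largest absolute element of U U^dagger - 1."""
    n = len(u)
    worst = 0.0
    for i in range(n):
        for j in range(n):
            entry = sum(u[i][k] * u[j][k].conjugate() for k in range(n))
            worst = max(worst, abs(entry - (1.0 if i == j else 0.0)))
    return worst


def max_difference(a, b):
    """Largest absolute element-wise difference between two matrices."""
    return max(abs(x - y) for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))


if __name__ == "__main__":
    u = mixing_matrix(0.6, 0.8, 0.15, 1.0)
    print("unitarity defect:", unitarity_defect(u))
    # With theta13 = 0 the matrix no longer depends on delta.
    print("delta dependence at theta13 = 0:",
          max_difference(mixing_matrix(0.6, 0.8, 0.0, 0.0),
                         mixing_matrix(0.6, 0.8, 0.0, 2.0)))
```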
¹ CP is a proposed physical symmetry in which the combination of charge conjugation (C) symmetry and parity (P) symmetry is assumed to be conserved. CP symmetry has been shown to be violated
in a few cases.
2.2.1 Experimental evidence and parameter bounds
Atmospheric neutrinos
Atmospheric neutrinos are created by decays of particles (π and K mesons) in
the upper atmosphere, about 10-30 km above the surface of the earth. Such
reactions will produce both νµ and νe. Some of the best data we have on atmospheric neutrinos are provided by the Kamiokande experiment and its successor,
Super-Kamiokande, in Japan. Super-Kamiokande consists of a large underground tank filled with
50 000 tons of water surrounded by photomultiplier tubes. A high-energy neutrino interacting with an electron or nucleus in the water will produce Cherenkov
radiation characteristic of each type of interaction. This can be used to determine which neutrino flavor took part in the interaction, and from which
direction it came. The interesting thing is that the ratio $\nu_\mu/\nu_e$ shows a strong zenith
angle dependence. The distribution of νe turns out to be very isotropic, while there
are far fewer νµ coming from the "backside" of the earth than from
the atmosphere above the observatory. So it seems that the νµ disappear more
the farther they travel. The simplest interpretation of this is that it is caused by
neutrino oscillations, and that a νµ oscillates into another flavor with a much larger
probability than a νe does. Results from Super-K give a preferred
value of $\Delta m^2_{\rm atm} = \Delta m^2_{32} \approx 2.6 \times 10^{-3}$ eV². Here $\Delta m^2_{32}$ denotes the difference between the squared masses of the mass eigenstates 3 and 2. This result is consistent with the results presented in March 2006 from the MINOS experiment, which detects oscillations of
neutrinos from Fermilab and reported a best fit of $\Delta m^2_{32} \approx 3.1 \times 10^{-3}$ eV²
with completely different systematics [7].
Neutrinos from nuclear reactors
Nuclear reactors, which produce large numbers of ν̄e, are a good earth-based source of neutrinos.
Early experiments sensitive to ν̄e (CHOOZ and Palo Verde)
looked for oscillations of ν̄e without finding any signal [8, 9]. They
used short baselines of ∼ 100 m. Later, experiments with a longer distance between
source and detector have been designed, and the first reactor-based experiment pointing
towards neutrino oscillations is the ongoing KamLAND experiment [10], which
has observed fewer ν̄e than expected in a non-oscillation scenario. Combined with
data from solar neutrino experiments, the KamLAND results have given constraints
on $\Delta m^2_{12}$ and the corresponding mixing angle.
An interesting thing is that these results from reactor neutrinos are not compatible with interpreting the disappearance of atmospheric νµ as a νµ → νe oscillation. The simplest interpretation of the atmospheric neutrino oscillations is then a
νµ → ντ scenario.
Another possibility is an oscillation of the form νµ → νs, where νs is a sterile
neutrino, that is, an additional neutrino which does not take part directly in weak
interactions. The existence of such sterile neutrinos is suggested by many grand
unified theory (GUT) models (see Table 2.2) for mass generation like the seesaw
mechanism (more about the seesaw mechanism later). Here the sterile neutrino
will be a heavy right-handed Majorana neutrino. It is also possible to construct
models with light sterile neutrinos (see for example [11]). But this sterile neutrino
oscillation scenario does not seem to fit the data from nuclear reactors very well.
The best fits of the data to our model, including the recent MINOS result,
give $\Delta m^2_{\rm atm} = 2.5 \times 10^{-3}$ eV² and $\sin^2 2\theta_{\rm atm} = 1.00$ [12] (close to maximal
mixing). This solution is called the large mixing angle (LMA) solution, and it is at
present strongly favored compared to an alternative called the small mixing angle
(SMA) solution.
Figure 2.1: The typical energies for neutrinos from different decay processes in the
sun compared to the energies detected by the different types of detectors. Figure
from [13].
Solar neutrinos
Since the mid-1960s and until the end of the century, the so-called “solar neutrino
problem” was an unsolved puzzle in physics. The problem consisted in a discrepancy between the expected production of solar νe from the standard solar model,
and the observed νe flux in large earth-based observatories. Only about 1/3 of the
predicted flux was observed. In the standard solar model νe is produced both in the
fusion process from H to He and by decay of $^7$Be and $^8$B, each with a characteristic
energy. Using detectors based on gallium, chlorine and Cherenkov radiation, the
different energy regions were covered (see Figure 2.1). Without finding the missing 2/3, it also seemed hard to change the solar model in any sensible way to fit the
observed νe flux.
The solution came when heavy water, D$_2$O, was used instead of ordinary water in the SNO
(Sudbury Neutrino Observatory) detector. This made it
possible to detect also νµ and ντ, in addition to νe, through the reactions
\[ \begin{aligned} \nu_e + d &\to p + p + e^- \\ \nu_{e,\mu,\tau} + d &\to p + n + \nu_{e,\mu,\tau} \\ \nu_{e,\mu,\tau} + e^- &\to \nu_{e,\mu,\tau} + e^- \end{aligned} \]
Now the total ν flux corresponded very well to the expected νe production in
the sun, and both the standard solar model and neutrino oscillations were confirmed.
In addition some excellent new data on the neutrino oscillation parameters were
obtained. The results also put strong constraints on the effect of sterile neutrinos. If they
exist, they hardly mix with $\nu_{e,\mu,\tau}$.
The detailed analysis is complicated by an effect called the Mikheyev-Smirnov-Wolfenstein (MSW) effect. In the equations presented in this chapter, we have
assumed neutrino propagation in vacuum. The MSW effect stems from the corrections to the oscillation equations due to the presence of matter. The interior of the
sun, for example, is far from a vacuum. Even though neutrinos have a
mean free path in lead of more than one light year, the MSW effect does indeed
play a role for solar neutrinos, especially for νe, which interacts through charged
currents².
Taking this effect into account, the best fit results for $\nu_e \to \nu_{\mu,\tau}$ when combining the results for solar neutrinos and the KamLAND experiment are [12] $\Delta m^2_\odot = 7.9 \times 10^{-5}$ eV² and $\sin^2 2\theta_\odot \approx 0.81$.
2.2.2 Summary of neutrino oscillations
We see that there exists strong evidence for neutrino oscillations from different
and independent experiments. The most important results from these experiments,
when it comes to the impact of massive neutrinos in cosmology, are
• The positive detection of neutrino oscillations confirms that at least two of
the neutrinos indeed are massive.
• The best fit results for solar and atmospheric neutrinos yield $\Delta m^2_\odot = \Delta m^2_{21} = 7.9 \times 10^{-5}$ eV² and $\Delta m^2_{\rm atm} = \Delta m^2_{32} = 2.5 \times 10^{-3}$ eV², which put a lower
limit on the total neutrino mass. These scales might also give a hint of the
absolute mass scale, and they tell us for which absolute masses the
neutrino masses can be considered to be degenerate ($m_\nu \gg \Delta m_\nu$).
• For the mixing angles the preferred values are $\sin^2 2\theta_{12} \approx 0.81$, $\sin^2 2\theta_{23} \approx 1.00$ and $\sin^2 \theta_{13} < 0.045$. The values of these mixing angles are of no
importance for the cosmological mass limits on neutrinos, but in particle
physics they are of great importance for understanding the underlying mechanisms for creating neutrino mass. Especially a better determination of the
smallness of $\theta_{13}$ is considered to be a holy grail for this understanding.
• The experimental results presented here seem to favor a scenario with no additional sterile neutrinos. But it should be mentioned that the Los Alamos
Liquid Scintillator Neutrino Detector (LSND) has found indications of a higher $\Delta m^2_\nu = 0.2 - 2$ eV², which would imply at least one heavy, sterile neutrino. These
results are still controversial, and are being checked by the ongoing MiniBooNE experiment.
When it comes to future prospects of neutrino oscillations, one can expect the oscillation parameters to be determined with a greater accuracy than today, especially
for the atmospheric $\Delta m^2_\nu$, using future long-baseline experiments. But the scale
of the $\Delta m^2_\nu$ values is believed to be settled by now. If $\theta_{13}$ is not too far from its current
upper limit, it is also expected to be measured in future neutrino oscillation experiments.
² In electro-weak theory the weak interactions are mediated by three different vector bosons: the
electrically neutral $Z^0$ boson, and the electrically charged $W^+$ and $W^-$ bosons. The neutral currents
correspond to interaction through the $Z^0$ boson, while the charged currents correspond to interaction
through the $W^\pm$ bosons.
2.3 Neutrino mass schemes
The mass differences obtained experimentally may be ordered in two different
mass schemes. The so-called normal hierarchy has $m_3 > m_2 > m_1$, while the inverted hierarchy has $m_2 > m_1 > m_3$. See figure 2.2 for an illustration of the two different mass schemes. In the case of heavy neutrinos, $m_1^2 \approx m_2^2 \approx m_3^2 \gg \Delta m^2_{\rm atm}$,
we say that the neutrino masses are degenerate, since in this case the mass differences are vanishingly small compared to the absolute masses. At present we do not
know which of the schemes is the correct one, although it has been claimed
that the inverted hierarchy is disfavored by observations of neutrinos from supernova 1987A [14]. One hopes that new nearby supernovae in the future will give
more information on the mass hierarchy. It is also possible that we will be able
to distinguish the different mass schemes by cosmological observations, although
that would require significantly better observations than we have today [15, 16].
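The two mass schemes, together with the measured mass squared differences, fix all three masses once the lightest mass is given. The sketch below illustrates this with the best fit values quoted in this chapter. It rests on stated assumptions: in both schemes the solar splitting is taken to separate $m_1$ and $m_2$ and the atmospheric splitting to separate $m_2$ and $m_3$, and the function names are hypothetical.

```python
import math

# Best fit mass squared differences quoted in this chapter, in eV^2.
DM2_SOLAR = 7.9e-5
DM2_ATM = 2.5e-3


def masses_normal(m_lightest):
    """(m1, m2, m3) in eV for the normal hierarchy, m1 being the lightest."""
    m1 = m_lightest
    m2 = math.sqrt(m1 ** 2 + DM2_SOLAR)
    m3 = math.sqrt(m2 ** 2 + DM2_ATM)
    return m1, m2, m3


def masses_inverted(m_lightest):
    """(m1, m2, m3) in eV for the inverted hierarchy, m3 being the lightest.

    Assumption: m2^2 - m3^2 = DM2_ATM and m2^2 - m1^2 = DM2_SOLAR.
    """
    m3 = m_lightest
    m2 = math.sqrt(m3 ** 2 + DM2_ATM)
    m1 = math.sqrt(m2 ** 2 - DM2_SOLAR)
    return m1, m2, m3


if __name__ == "__main__":
    for label, scheme in (("normal", masses_normal), ("inverted", masses_inverted)):
        for m_lightest in (0.0, 0.1, 1.0):
            masses = scheme(m_lightest)
            print(f"{label:8s} lightest = {m_lightest:.1f} eV: "
                  f"sum = {sum(masses):.4f} eV")
```

With a massless lightest state the sum of the masses comes out at roughly 0.06 eV for the normal and 0.10 eV for the inverted hierarchy, while for a lightest mass of 1 eV the three masses differ by only about a tenth of a percent, which is the degenerate regime.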
2.4 Determination of absolute neutrino masses
While neutrino oscillation experiments have provided us with relatively reliable
data on the mass square differences, there is still a long way to go to obtain the
same precision when it comes to the absolute neutrino mass scale. Here I will
give a summary of some of the most common and promising methods to determine
this mass scale. One of these methods is the use of cosmological observations.
Although most of the rest of this thesis concerns cosmological methods, I
will, for completeness, also mention them here. Unless other references are given, this
section is based on [6], [5], [1], [12] and [17].
2.4.1 Tritium beta decay
[Figure 2.2: The two possible neutrino mass hierarchies, the normal and inverted hierarchy. Panels: "Normal hierarchy" and "Inverted hierarchy"; vertical axis: m; the levels $m_1$, $m_2$, $m_3$ are separated by $\Delta m^2_\odot$ and $\Delta m^2_{\mathrm{atm}}$.]

Tritium is a radioactive isotope of hydrogen decaying as
$$ {}^{3}_{1}\mathrm{T} \rightarrow {}^{3}_{2}\mathrm{He} + e^- + \bar{\nu}_e , $$
a reaction which releases 18.6 keV of energy. How much of this energy can
be carried away by the electron depends on the mass $m_{\bar{\nu}_e}$ (see footnote 3). By measuring the
endpoint of the electron energy spectrum, one gets an indication of the absolute
mass scale of the neutrinos, since the maximum energy available to the electron
is reduced by the energy bound in the mass of the $\bar{\nu}_e$. At present, the best limits using
this method are provided by the currently running Mainz and Troitsk experiments
[1], giving upper limits of $m_{\nu_e} < 2.2\,\mathrm{eV}$ and $m_{\nu_e} < 2.5\,\mathrm{eV}$ respectively. But a new
experiment, KATRIN, which will start taking data in 2007, is expected to obtain an
upper limit as low as $\sim 0.2\,\mathrm{eV}$.
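The endpoint argument can be illustrated with a small sketch. It uses the simplified textbook phase-space shape $(E_0 - E)\sqrt{(E_0 - E)^2 - m_\nu^2}$ near the endpoint, which is an assumption of this example and not a formula from this thesis; the Fermi function, final-state effects and detector response are all ignored, and the function name `endpoint_shape` is hypothetical.

```python
import math

E0_EV = 18.6e3  # endpoint energy in eV, the 18.6 keV quoted in the text


def endpoint_shape(e_kin, m_nu):
    """Unnormalised electron spectrum shape near the endpoint (energies in eV).

    Simplified phase-space form (E0 - E) * sqrt((E0 - E)^2 - m_nu^2).
    """
    eps = E0_EV - e_kin          # energy left over for the neutrino
    if eps < m_nu:
        return 0.0               # kinematically forbidden region
    return eps * math.sqrt(eps**2 - m_nu**2)


for m_nu in (0.0, 0.2, 2.2):
    values = [endpoint_shape(E0_EV - d, m_nu) for d in (5.0, 3.0, 1.0)]
    print(f"m_nu = {m_nu:3.1f} eV:", ", ".join(f"{v:6.2f}" for v in values))
```

In this simplified picture a nonzero mass both cuts the spectrum off at $E_0 - m_\nu$ and lowers the count rate in the last few eV, which is why the measurement is so demanding.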
2.4.2 Neutrinoless double beta decay
Neutrinoless double β-decay (0νββ) is a field that has been given a lot of attention
in recent years, not only because of its prospects of pinning down the absolute neutrino mass with high precision, but also because a positive detection of this process
would imply that the neutrinos are of Majorana nature.

Footnote 3: By $m_{\nu_e}$ I mean the weighted sum of the mass states comprising $\nu_e$, that is $m_{\nu_e}^2 = \sum_i |U_{ei}|^2 m_i^2$, where $U_{ij}$ is the three-dimensional mixing matrix.

The usual double β-decay (2νββ) is a very rare second-order process where
two neutrons in a nucleus decay simultaneously:
$$ (Z,A) \rightarrow (Z+2,A) + e_1^- + e_2^- + \bar{\nu}_{e1} + \bar{\nu}_{e2} . $$
If the neutrinos possess Majorana mass, there is also a slight possibility for a 0νββ
reaction of the form
$$ (Z,A) \rightarrow (Z+2,A) + e_1^- + e_2^- $$
where the conservation of lepton number is violated. This may happen since in
the Majorana case there is a mass-dependent probability that one of the neutrinos
produced is right-handed and can be absorbed by a neutron, producing a new proton-electron pair. The mass dependence of this reaction enters the expression for the
half-life as a $\langle m_{\nu_e}/m_e \rangle^2$ term, and the half-life for $m_{\nu_e} = 1\,\mathrm{eV}$ and $E_\nu = 1\,\mathrm{MeV}$
becomes $\tau_{0\nu} = 3 \times 10^{24}$ years. Needless to say, the possible effect is small and
hard to detect. The observable neutrino mass-dependence is
$$ \langle m_{\nu_e} \rangle = \Big| \sum_i U_{ei}^2\, m_i \Big| . \tag{2.18} $$
A positive detection would give a good indication of the total neutrino mass, at
least if the correct mass scheme is known.
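The difference between the combination in (2.18) and the one probed by tritium decay (footnote 3) is that $U_{ei}^2$ is in general complex, so the terms in (2.18) can partly cancel. The sketch below shows this under stated assumptions: the mixing weights $|U_{ei}|^2$ and the masses are illustrative numbers chosen for the example, and the two relative phases are a parametrisation I introduce here, not something defined in this thesis.

```python
import cmath
import math


def effective_majorana_mass(masses, u_e_abs2, phases=(0.0, 0.0)):
    """|sum_i U_ei^2 m_i| as in (2.18), in the units of `masses`.

    masses   : (m1, m2, m3)
    u_e_abs2 : (|U_e1|^2, |U_e2|^2, |U_e3|^2)
    phases   : assumed relative phases (radians) of U_e2^2 and U_e3^2
    """
    factors = (1.0, cmath.exp(1j * phases[0]), cmath.exp(1j * phases[1]))
    return abs(sum(f * u2 * m for f, u2, m in zip(factors, u_e_abs2, masses)))


def beta_decay_mass(masses, u_e_abs2):
    """sqrt(sum_i |U_ei|^2 m_i^2), the combination relevant for tritium decay."""
    return math.sqrt(sum(u2 * m**2 for u2, m in zip(u_e_abs2, masses)))


U_E_ABS2 = (0.68, 0.30, 0.02)   # illustrative weights, assumed for this sketch
MASSES = (0.4, 0.4, 0.4)        # degenerate spectrum in eV, illustrative

print("beta decay mass       :", beta_decay_mass(MASSES, U_E_ABS2))
print("0vbb mass, no phases  :", effective_majorana_mass(MASSES, U_E_ABS2))
print("0vbb mass, phase of pi:", effective_majorana_mass(MASSES, U_E_ABS2, (math.pi, 0.0)))
```

With these inputs the 0νββ mass drops from 0.4 eV to 0.16 eV when the second term enters with opposite sign, while the tritium combination is unaffected. This is one reason why a 0νββ result alone does not fix the total mass.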
Actually, evidence for a positive detection of 0νββ has been claimed by
part of the Heidelberg-Moscow collaboration, whose results favor a neutrino mass $\langle m_{\nu_e} \rangle = (0.2 - 0.6)\,\mathrm{eV}$ (99.73% CL) with a best-fit
value of $\langle m_{\nu_e} \rangle \approx 0.4\,\mathrm{eV}$ [18, 19]. The calculation of the nuclear matrix
elements involved is however uncertain, and allowing for a 50% uncertainty in these matrix
elements the range widens to $\langle m_{\nu_e} \rangle = (0.1 - 0.9)\,\mathrm{eV}$. The experiment observes
enriched germanium, $^{76}$Ge, undergoing double β-decay into $^{76}$Se. The
results are still heavily debated because of the large background noise and the small statistics provided by the experiment. The statistical techniques applied have also been criticized. Another problem with mass estimates
from 0νββ is that the theoretical matrix elements involved in the process vary
between papers. New and improved 0νββ experiments are proposed, and the most
promising are probably GERDA and MAJORANA, also based on decay of $^{76}$Ge.
In GERDA a sensitivity of $\langle m_{\nu_e} \rangle \sim 0.050\,\mathrm{eV}$ is expected to be reached around
2010. A possible confirmation of the Heidelberg-Moscow result will probably
be reached by 2008.
The possible confirmation of the Heidelberg-Moscow results is undoubtedly
one of the more exciting things happening in neutrino physics today. Not only
could it give a good indication of the neutrino mass scale (and not just an upper
limit), but since a positive detection would imply that neutrinos indeed are Majorana particles, it would be of great theoretical interest.
2.4.3 Cosmology
Since the physics of neutrinos in cosmology will be treated a lot more thoroughly
later in this thesis, I will just scratch the surface here. For further summaries of the
state of neutrino cosmology today, see e.g. [20], [21], [22], [23] or [24].
After photons, neutrinos are the most abundant (known) particle in the Universe, and the number density of neutrinos is known from the well-understood
physics of the early universe. The impact of neutrinos on cosmological observables is mainly due to suppression of structure growth on scales smaller than the
mass dependent free-streaming scale of neutrinos. Massive neutrinos also affect
other cosmological observables like the cosmic expansion history and the Cosmic
Microwave Background radiation (CMB), which depend on when the neutrinos became non-relativistic, which again depends on their mass. Cosmology is at leading
order only sensitive to the sum of the neutrino masses,
$\sum_i m_i \equiv M_\nu$. This makes cosmology an important probe of neutrino masses,
since the oscillation experiments only measure mass differences.
Since the number of neutrino species affects the number of degrees of freedom
in the early universe, it will also affect the temperature at which neutrons and protons fall out of equilibrium. This will in turn affect Big Bang Nucleosynthesis
(BBN) by altering the ratio of neutrons to protons at the time of the weak interaction freeze-out. By measuring this ratio, combined with other data, one has put a
limit on the number of neutrino species of $1.7 \le N_\nu \le 3.0$ [25].
The last years have provided us with new and improved data on both the CMB (from
the Wilkinson Microwave Anisotropy Probe (WMAP)) and Large Scale Structure
(LSS), from 2dFGRS and the Sloan Digital Sky Survey (SDSS). At the same time
other cosmological parameters have been pinned down to greater accuracy by,
for example, improved statistics on Supernova type 1a (SN1a) data. Combinations
of these data sets have given upper limits on the total neutrino mass ranging from
$M_\nu < 0.17\,\mathrm{eV}$ to $M_\nu < 2.0\,\mathrm{eV}$, depending on which data have been used and the priors on the
other cosmological parameters. In [26] they even found an upper limit of $M_\nu < 2.0\,\mathrm{eV}$
from WMAP data only. Some of the results from cosmology are listed in
Table 2.1 (see footnote 4).
Reference   Year   Upper limit on Mν (eV)   Data used
[27]        2002   2.2                      2dFGRS, BBN, SN1a
[28]        2003   1.0                      WMAP, 2dFGRS, BBN, SN1a
[26]        2004   2.0                      WMAP
[29]        2004   1.7                      WMAP, SDSS
[30]        2004   0.60                     WMAP, 2dF, SDSS, SN1a
[31]        2004   0.75                     WMAP, 2dF, SDSS
[32]        2005   0.42                     WMAP, SDSS, Ly-α
[33]        2006   0.30                     WMAP, SDSS, Ly-α, SN1a, BAO
[34]        2006   0.17                     WMAP, misc. CMB, SDSS, 2dF, Ly-α (see footnote 5), SN1a, BAO
Table 2.1: Various upper limits (95% C.L.) on Mν from cosmological data.

Footnote 4: The nature of the data referred to in this table will be explained in more detail later.
Although cosmology provides us with very good mass limits on the neutrinos
compared to the other experiments referred to here, it should be mentioned that
using cosmological observations to constrain the neutrino mass is a very indirect
way of measuring it, and therefore also very model dependent. For example, if,
for some strange reason, the Big Bang model should turn out to be wrong, all of
these mass constraints will be worthless. The standard model of the universe today is the
ΛCDM model, dominated at present by dark energy in the form of a cosmological constant
with an equation of state $P = w_X \rho = -\rho$. If for example $w_X$ differs slightly from
−1, the neutrino mass constraints would be weakened [35], so one always has to
interpret cosmological data with some extra care. See Appendix A for comments on
model dependency in cosmology.
2.5 How to give the neutrinos their masses
In the standard model (SM) of particle physics the neutrinos are massless. So
to find a way to have massive neutrinos, we have to go beyond the SM. One of
the big questions about the neutrino masses is why they are so small compared to,
for example, those of the charged leptons. A proposed mass-generating mechanism should
therefore, in addition to creating mass, also give a natural explanation for the
small value of the mass. The most popular model today is the seesaw mechanism.
2.5.1 The seesaw mechanism for generating neutrino mass
The seesaw mechanism is partly an inspired-by-string-theory model (ISTM, see footnote 6), but
is also inspired by grand unified theories (GUT). See Table 2.2 for notes on GUT
and string theory. The short review of the seesaw mechanism given here is based
on the references [36], [37], [5], [6] and [1].
The string-inspired part of the seesaw mechanism lies in the fact that it may be
derived from an SO(10) model [5]. Luckily this is not the only reason why this
mechanism is so popular. The first motivation for introducing the seesaw mechanism might have been an attempt to connect two different peculiarities of the
neutrinos, namely the fact that the neutrino mass is so small compared to the other
leptons, and the fact that it carries no electric charge.

Footnote 5: In [34] they utilize a tight constraint on the amplitude of the power spectrum from Ly-α which
is not used in [33]. Also notice that there seems to be some inconsistency between the WMAP and
Ly-α data used in this analysis.

Footnote 6: An ISTM is a method that often involves physical/mathematical techniques that are non-standard
for the relevant field of application. One often introduces an ISTM to explain why a quantity is what
it is due to some underlying mechanism that is supposed to be more fundamental. A typical ISTM
introduces some extra dimensions, or at least some extra free parameters. Often these parameters in
the end have to be fine-tuned themselves to fit the observed quantities that they were supposed to explain, assuming that the fine-tuned parameters one day will fall out of the fundamental parameters of
string theory. At present string theory, although extremely exciting, has not predicted much that has
been tested, so it must still be considered no more than some promising and interesting speculation.
That a method is inspired by speculation is in itself not enough to make it interesting. So what one
should demand from an ISTM for it to be interesting is that it at least has fewer free parameters than
the number of parameters that it is trying to explain. But then again, a theory where this requirement
is fulfilled is interesting regardless of its source of inspiration. If an artist paints a masterpiece, saying that he/she was inspired by God, it would still be a masterpiece even if it one day turns out that
God does not exist.
The simplest extension of the SM allowing for massive neutrinos is the introduction of a right-handed SU(2) neutrino singlet. Doing this one may have a Dirac
mass term of the form
$$ m_D \bar{\nu}_L \nu_R \tag{2.19} $$
But, taking into consideration the smallness of the observed neutrino mass, this is
not an appealing form of a mass term, since it gives us no reason to believe that
the neutrino masses should be so much smaller than the mass of for example the
electrically charged leptons. At this point we exploit the extra freedom we have
since the neutrinos do not carry any electrical charge; we introduce an additional
Majorana mass term for right-handed neutrinos of the form
$$ \bar{\nu}_R M_R (\nu_R)^C . \tag{2.20} $$
This is the basic idea of the seesaw mechanism. Introduction of a Majorana mass
term will, as mentioned earlier, lead to a violation of lepton number, but will not
violate any of the underlying gauge symmetries of the SM. Now, the full 6 × 6
neutrino mass matrix will take the form
$$ M = \begin{pmatrix} 0 & m_D \\ m_D^T & M_R \end{pmatrix} \tag{2.21} $$
Since the SM does not put any constraints on the size of $M_R$, it is supposed to be
large relative to $m_D$ and might be comparable to a hypothetical unification scale like
the GUT scale. Diagonalizing this mass matrix we get one heavy eigenstate $N_R$
and a light eigenstate $m_\nu$ given by
$$ N_R \simeq M_R , \qquad m_\nu \simeq \frac{m_D^2}{M_R} \tag{2.22} $$
The Dirac mass matrix $m_D$ is proportional to the scale at which the breaking of
SU(2) × U(1) takes place, that is the vacuum expectation value of the Higgs
doublet, $v \sim 300\,\mathrm{GeV}$. The neutrino mass is still unknown, but if, say, $m_\nu \sim
10^{-2}\,\mathrm{eV}$, we find that $M_R \sim 10^{13}\,\mathrm{GeV}$, which is approaching the assumed GUT
scale of $E \sim 10^{15}\,\mathrm{GeV}$. These are rough estimates, but at least this gives us a clue
about the scales that might be involved. So by imposing the Majorana mass term
in combination with the Dirac mass term, the small neutrino mass seems to fall out
quite naturally.
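The order-of-magnitude estimate can be reproduced by inverting (2.22), $M_R \simeq m_D^2 / m_\nu$. Since the text only states that $m_D$ is proportional to $v$, the value of the Dirac mass is treated as a free input in this sketch; the trial values below are my own assumptions, and `heavy_scale_gev` is a hypothetical helper name.

```python
EV_PER_GEV = 1.0e9


def heavy_scale_gev(m_dirac_gev, m_nu_ev):
    """Heavy Majorana scale M_R in GeV from the seesaw relation m_nu ~ m_D^2 / M_R."""
    m_nu_gev = m_nu_ev / EV_PER_GEV
    return m_dirac_gev**2 / m_nu_gev


# Light neutrino mass of 1e-2 eV as in the text; Dirac masses are trial inputs.
for m_d in (1.0, 10.0, 300.0):
    print(f"m_D = {m_d:6.1f} GeV  ->  M_R ~ {heavy_scale_gev(m_d, 1e-2):.1e} GeV")
```

The result scales as $m_D^2$, so it is sensitive to the assumed proportionality constant between $m_D$ and $v$: a Dirac mass of about 10 GeV gives $M_R \sim 10^{13}$ GeV, while setting $m_D$ equal to 300 GeV gives roughly $10^{16}$ GeV, close to the GUT scale quoted above.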
As mentioned earlier, an interesting and unsettled problem about neutrino masses
is the form of the mass spectrum. It turns out that one may easily obtain both hierarchical and degenerate neutrino mass spectra using the seesaw mechanism. From
(2.22) we get no clue about the relations between the different masses. To get this,
we will use an effective mass matrix for the left-handed neutrinos from (2.21) given
by [36]
$$ m_{\mathrm{eff}} = -m_D M_R^{-1} m_D^T \tag{2.23} $$
Example with degenerate mass scheme
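Having hierarchical eigenvalues for $m_D$ and $M_R$ may nonetheless give a degenerate mass spectrum for $m_{\mathrm{eff}}$. We start by assuming an $m_D$ of the form
$$ m_D = \lambda \begin{pmatrix} e^{i\epsilon_1} & 1 & 1 \\ 1 & e^{i\epsilon_1} & 1 \\ 1 & 1 & e^{i\epsilon_2} \end{pmatrix} \tag{2.24} $$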
Here λ sets the scale of $m_D$, and we should therefore have λ ∼ v. The $\epsilon_i$ are supposed
to be real parameters satisfying $|\epsilon_1| \ll |\epsilon_2| \ll 1$. So we see that $m_D$ can be written
as a small perturbation on the democratic matrix, Δ, scaled with λ, where Δ simply
is a matrix where all the elements equal 1, that is $\Delta_{ij} \equiv 1$. A 3 × 3 democratic
matrix will have eigenvalues (0, 0, 3). In addition, given any matrix Z, we have
$$ \Delta Z \Delta = \Big( \sum_{i,j} Z_{ij} \Big) \Delta \tag{2.25} $$
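Both statements about the democratic matrix are easy to check by hand, or with a few lines of code. The sketch below uses plain nested lists; the test matrix `Z` is an arbitrary choice made for this example.

```python
def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(a, v):
    """Matrix times column vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]


DELTA = [[1, 1, 1] for _ in range(3)]

# Eigenvectors of the democratic matrix: (1,1,1) has eigenvalue 3,
# while (1,-1,0) and (1,1,-2) both have eigenvalue 0.
print(matvec(DELTA, [1, 1, 1]))
print(matvec(DELTA, [1, -1, 0]))
print(matvec(DELTA, [1, 1, -2]))

# Relation (2.25): Delta Z Delta equals (sum of all elements of Z) times Delta.
Z = [[2, -1, 0], [5, 3, 1], [-4, 7, 6]]   # arbitrary test matrix
total = sum(sum(row) for row in Z)
lhs = matmul(matmul(DELTA, Z), DELTA)
rhs = [[total * entry for entry in row] for row in DELTA]
print(lhs == rhs)
```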
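Now we may write $m_D$ as
$$ m_D = \lambda(\Delta + \epsilon_1 A + \epsilon_2 B) \equiv \lambda m_{D0} \tag{2.26} $$
where
$$ A = \frac{(e^{i\epsilon_1} - 1)}{\epsilon_1}\,\mathrm{diag}(1, 1, 0), \qquad B = \frac{(e^{i\epsilon_2} - 1)}{\epsilon_2}\,\mathrm{diag}(0, 0, 1) \tag{2.27} $$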
Since $\epsilon_1$ and $\epsilon_2$ are small quantities, it is clear that A and B are of order 1. Remembering the constraints on $\epsilon_1$ and $\epsilon_2$, one sees from (2.26) that $m_D$ will have
hierarchical eigenvalues with one eigenvalue ∼ 3λ and two much smaller (but not
equal) eigenvalues (see footnote 7).
We now introduce a new matrix, W, given by
$$ W = \frac{1}{\sqrt{3}} \begin{pmatrix} e^{2\pi i/3} & 1 & 1 \\ 1 & e^{2\pi i/3} & 1 \\ 1 & 1 & e^{2\pi i/3} \end{pmatrix} . \tag{2.28} $$

Footnote 7: Of course, this is not very remarkable, since this $m_D$ was designed to give a hierarchical eigenvalue spectrum.
GUT
The three fundamental forces in the SM of particle physics (the electromagnetic force, the weak force and the strong force) each have their characteristic
coupling constants. Experiments show that these coupling constants are not
really constants, but that they tend to converge at larger energies. A simple
extrapolation of this behavior suggests that the three coupling constants will
unify at an energy $E_{\mathrm{GUT}} \sim 10^{15}\,\mathrm{GeV}$ and that beyond this GUT energy the
three forces will be described by the same grand unified theory. The different fundamental forces that we observe are thus only different low-energy
manifestations of the more fundamental GUT.
String theory
In addition to the three forces in the SM of particle physics we have gravity
as a fourth force in nature. Motivated by the hope that nature at its most
fundamental is very symmetric and elegant, many people believe that there exists one theory unifying all four forces, and that this will happen around
the Planck energy ($E_{\mathrm{TOE}} \sim 10^{18}\,\mathrm{GeV}$). Such a theory is called a theory of
everything (TOE). One hot candidate for a TOE is string theory.
In string theory the elementary objects are tiny, vibrating one-dimensional
strings rather than point particles, and these strings are supposed to live in a
multidimensional (often 10-dimensional) space. Much effort is put into the
task of making falsifiable predictions from string theory, but because of its
complex mathematical structure and the extremely high energy at which the
effects will manifest themselves, such predictions have been hard to make.
Another popular physical theory is that of supersymmetry (SUSY), where each
fundamental fermion has a "supersymmetric" bosonic partner and vice versa
(again based on the hope that nature at its most fundamental is simple, symmetric and beautiful). SUSY particles will be looked for at the LHC accelerator at
CERN, and the results may give some more insight into string theory and give
us hints on whether string theory is a fruitful path to follow.
Table 2.2: GUT and string theory
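This W has the property
$$ (W^*)^{-1} = W \tag{2.29} $$
We now require $M_R$ to be of the form
$$
\begin{aligned}
M_R &= \mu\, m_{D0} W^* m_{D0} \\
    &= \mu\, (\Delta + \epsilon_1 A + \epsilon_2 B) \cdot W^* \cdot (\Delta + \epsilon_1 A + \epsilon_2 B) \\
    &= \mu\, \big( 3e^{-\pi i/6} \Delta + \epsilon_1 A' + \epsilon_2 B' \big)
\end{aligned}
\tag{2.30}
$$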
Getting from the second to the last line, we have made use of (2.25) and used the
smallness of ǫi to omit the terms to second order in ǫi . Because A and B are
required to be of order 1, that is also the case with A′ and B ′ . We see that our
required form for MR leads to a hierarchical structure also for this quantity.
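Using (2.23) and (2.29), $m_{\mathrm{eff}}$ is now given by
$$
\begin{aligned}
m_{\mathrm{eff}} &= -\frac{\lambda^2}{\mu}\, m_{D0} \left( m_{D0} W^* m_{D0} \right)^{-1} m_{D0} \\
                 &= -\frac{\lambda^2}{\mu}\, W
\end{aligned}
\tag{2.31}
$$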
Since W has a degenerate eigenvalue spectrum, that will be the case also for $m_{\mathrm{eff}}$.
So by having a hierarchical structure in both $m_D$ and $M_R$ we can still obtain a degenerate mass spectrum from $m_{\mathrm{eff}}$, which is an interesting observation. However, we
had to use the specific form (2.30) of $M_R$ to obtain this result, so it does
not mean that the seesaw mechanism favors a degenerate mass scheme, only that
it is a possible solution.
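The two properties of W that the argument relies on can be checked numerically. The sketch below verifies that $W W^* = 1$, which is equivalent to (2.29), and that all three eigenvalues of W have the same absolute value. Reading "degenerate" as "equal in absolute value" is my interpretation, and the eigenvalues are obtained from the observation that W is a multiple of the unit matrix plus a multiple of Δ.

```python
import cmath
import math

OMEGA = cmath.exp(2j * math.pi / 3)          # diagonal entry e^{2 pi i / 3}
NORM = 1 / math.sqrt(3)
W = [[NORM * (OMEGA if i == j else 1 + 0j) for j in range(3)] for i in range(3)]
W_CONJ = [[z.conjugate() for z in row] for row in W]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(a, v):
    """Matrix times column vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]


# (2.29): (W*)^{-1} = W is equivalent to W W* = 1.
product = matmul(W, W_CONJ)
print([[round(abs(z), 12) for z in row] for row in product])

# W = ((OMEGA - 1) * 1 + Delta) / sqrt(3), so it shares the eigenvectors of Delta.
eigenpairs = [
    (NORM * (OMEGA - 1), [1, -1, 0]),
    (NORM * (OMEGA - 1), [1, 1, -2]),
    (NORM * (OMEGA + 2), [1, 1, 1]),
]
for value, vector in eigenpairs:
    residual = max(abs(wv - value * v) for wv, v in zip(matvec(W, vector), vector))
    print(f"|eigenvalue| = {abs(value):.6f}   residual = {residual:.1e}")
```

Since $m_{\mathrm{eff}}$ in (2.31) is proportional to W, eigenvalues of equal modulus translate into three light neutrino masses of equal size $\lambda^2/\mu$.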
In [36] they work out a concrete example with a specific form for both the
charged lepton mass matrix and the effective neutrino mass matrix, with three
free parameters in each. They rely on the mass constraint from the Heidelberg-Moscow experiment of $|m_{ee}| \approx 0.36\,\mathrm{eV}$, and see how this kind of degenerate mass
spectrum fits the data for the mixing angles and $\Delta m^2$s from neutrino oscillation
experiments. They find their model to be
compatible with the LMA solution (commented on page 9) favored by the oscillation experiments, but their seesaw model fits an SMA solution even better.
Example with hierarchical neutrino masses
As shown in the previous section, it is possible to obtain a degenerate neutrino
mass spectrum using the seesaw mechanism. In this section we will, following
the reasoning in [36] and [37], see that the seesaw mechanism may also produce a
hierarchical mass spectrum.
The reasoning and techniques used here will be very similar to the ones presented in the last section when treating a degenerate neutrino mass spectrum. Again
the neutrino mass is generated through an extension of the SM, introducing right-handed Majorana neutrinos. All mass matrices are assumed to be proportional to
the democratic matrix to first order, and the perturbations to the democratic
matrices are assumed to be of the same form for the charged leptons, the Dirac
neutrinos and the right-handed Majorana neutrinos. Why this assumption? In addition
to making it possible to do some analytical considerations when perturbing around
Δ, this assumption is inspired by QCD, where analogous techniques are applied to
describe the quark mass hierarchy and mixing angles in a very successful way.
The assumptions stated in the last paragraph can be written as
$$ M_i = c_i (\Delta + P_i), \qquad P_i = \begin{pmatrix} 0 & 0 & 0 \\ 0 & a_i & 0 \\ 0 & 0 & b_i \end{pmatrix} \tag{2.32} $$
where i runs over the lepton, Dirac neutrino and Majorana neutrino sector, i =
l, D, R. This is the same form of M that gave the hierarchical mass spectra for
mD and MR in the previous section. Using (2.23) and (2.32) we can write the
effective mass matrix for the observable left-handed neutrinos as
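$$ M_{\mathrm{eff}} = -\frac{c_D^2}{c_R}\, (\Delta + P_D)\, (\Delta + P_R)^{-1}\, (\Delta + P_D)^T \equiv -\frac{c_D^2}{c_R}\, M^0_{\mathrm{eff}} \tag{2.33} $$
The only surviving terms of $M^0_{\mathrm{eff}}$ are $\Delta (\Delta + P_R)^{-1} \Delta = \big( \sum_{ij} [(\Delta + P_R)^{-1}]_{ij} \big) \Delta = \Delta$
(the elements of $(\Delta + P_R)^{-1}$ turn out to sum to 1) and $P_D (\Delta + P_R)^{-1} P_D$. If we
now define
$$ x \equiv \frac{a_D^2}{a_R}, \qquad y \equiv \frac{b_D^2}{b_R} $$
the last of the remaining terms in $M^0_{\mathrm{eff}}$ may be written
$$ P_D (\Delta + P_R)^{-1} P_D = \begin{pmatrix} 0 & 0 & 0 \\ 0 & x & 0 \\ 0 & 0 & y \end{pmatrix} \equiv P_{\mathrm{eff}} \tag{2.34} $$
So we are left with
$$ M^0_{\mathrm{eff}} = \Delta + P_{\mathrm{eff}}, \tag{2.35} $$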
which can be shown by performing basic matrix algebra, or faster, by using an
analytical math tool like Maple. We already notice that if |x| ≪ |y| ≪ 1, the
effective mass spectrum will be hierarchical (like for mD in (2.24)). If Peff is of
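the same order or dominates over Δ, it is convenient to define
$$ \frac{M^0_{\mathrm{eff}}}{y} = M' = \begin{pmatrix} 0 & 0 & 0 \\ 0 & \delta & 0 \\ 0 & 0 & 1 \end{pmatrix} + \varepsilon \Delta \tag{2.36} $$
where $\varepsilon \equiv 1/y$ and $\delta \equiv x/y$. If we write the diagonalization of this $M'$ as $M = F \cdot M' \cdot F^T$, this $M$ may be written as [36]
$$ M = G \cdot \begin{pmatrix} \frac{\delta}{2} & -\frac{\delta}{2} & 0 \\ -\frac{\delta}{2} & \frac{\delta}{2} + 2\varepsilon & \sqrt{2}\,\varepsilon \\ 0 & \sqrt{2}\,\varepsilon & 1 \end{pmatrix} \cdot G^T \tag{2.37} $$
where
$$ G = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -\frac{1}{\sqrt{3}} & \frac{\sqrt{2}}{\sqrt{3}} \\ 0 & \frac{\sqrt{2}}{\sqrt{3}} & \frac{1}{\sqrt{3}} \end{pmatrix} \tag{2.38} $$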
Just from (2.36) and (2.37) we now see that if $|\varepsilon|$ and $\delta$ are ≪ 1, we will have
hierarchical eigenvalues (two small and one large). To be more explicit, in [36]
they give the approximate eigenvalues of M to be
$$ 1, \qquad \frac{1}{2} \left( \delta + 2\varepsilon \pm \sqrt{\delta^2 + 4\varepsilon^2} \right) \tag{2.39} $$
So by the assumption of democracy to the leading order of the fundamental mass
matrices, an assumption that has proved to be successful in the quark sector, a
hierarchical mass scheme falls out quite naturally, without the need of “fine-tuning”
of the form of MR as had to be done in the example where a degenerate mass
scheme was obtained.
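The matrix identity behind (2.33) to (2.35) can be confirmed with exact rational arithmetic instead of a computer algebra system. The following is a minimal sketch; the values chosen for $a_D$, $b_D$, $a_R$ and $b_R$ are arbitrary test numbers assumed for the example, and the helper names are hypothetical.

```python
from fractions import Fraction as F


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    return [list(col) for col in zip(*a)]


def inverse(m):
    """Gauss-Jordan inverse of a 3x3 matrix with Fraction entries."""
    n = 3
    aug = [list(row) + [F(int(i == j)) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def democratic_plus(a, b):
    """Delta + diag(0, a, b), the matrix structure of (2.32) without the prefactor."""
    diag = (F(0), F(a), F(b))
    return [[F(1) + (diag[i] if i == j else 0) for j in range(3)] for i in range(3)]


# Arbitrary test values for the perturbation parameters.
a_d, b_d = F(1, 50), F(1, 5)
a_r, b_r = F(1, 7), F(1, 3)

m_d = democratic_plus(a_d, b_d)
m_r = democratic_plus(a_r, b_r)
m_eff0 = matmul(matmul(m_d, inverse(m_r)), transpose(m_d))

x, y = a_d**2 / a_r, b_d**2 / b_r
expected = democratic_plus(x, y)      # Delta + P_eff as in (2.35)
print(m_eff0 == expected)
```

Because `Fraction` arithmetic is exact, the comparison tests the identity itself rather than agreement up to rounding errors.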
Using the experimental priors on $\Delta m^2_\odot$, $\Delta m^2_{\mathrm{atm}}$ and the preferred values of the mixing angles to determine
the three free parameters in this seesaw model, more accurate numerical solutions in [37] and [36] give the following best-fit
values for the experimentally preferred LMA scenario:
$$
\begin{aligned}
\Delta m^2_{12} &= 5.36 \times 10^{-5}\,\mathrm{eV}^2 \\
\Delta m^2_{32} &= 3.94 \times 10^{-3}\,\mathrm{eV}^2 \\
\sin^2 2\theta_{12} &= 0.95 \\
\sin^2 2\theta_{23} &= 0.95 \\
M_{R1} &= 3.1 \times 10^{6}\,\mathrm{GeV} \\
M_{R2} &= 1.3 \times 10^{16}\,\mathrm{GeV}
\end{aligned}
\tag{2.40}
$$
The first four of these quantities are not too far from being compatible with the
observational results presented earlier in this chapter, while the two last ones have
never been observed. The MR -scale, although impossible to test experimentally,
may be deduced in a supersymmetric framework from future experiments [6]. It
is worth noticing that, although using the same techniques, the large mixing angle
solutions obtained here (and in observations), do not occur in the quark sector
(where there are only small mixing angles).
What does the seesaw mechanism tell us?
The seesaw mechanism undoubtedly gives a relatively simple and appealing explanation of the small neutrino mass. But do these examples mean that the seesaw
mechanism favors a hierarchical mass spectrum? Not really. If one abandons the
assumption of all fundamental mass matrices being close to democratic, one may
use the seesaw framework to find a degenerate mass spectrum (as done here),
or an inverted hierarchical mass spectrum. How these different mass schemes fall
out of different choices of parameters is shown in for example [38].
So, does the seesaw mechanism predict anything at all? The different ways of
using the seesaw mechanism produce, in addition to limits on the present neutrino
observables, also predictions for the heavy $M_R$ Majorana masses. So if these can
be constrained by for example detection of supersymmetric particles at LHC at
CERN, one may rule out some of the seesaw models. This could also be done
simply by tightening the bounds on the other neutrino parameters, like the mixing
angles.
2.5.2 Other ways to generate neutrino mass
Even if the seesaw mechanism today is by far the most popular way to generate
the small neutrino masses, there are of course also other proposals that should be
mentioned (see [6] and [1]).
One of the mechanisms is based on loop diagrams at the SUSY scale, where the
self-energy in the loops may generate neutrino mass in SUSY theories. This mechanism can again be divided into the Zee model, which uses one-loop diagrams, and the Babu
model, which uses two-loop diagrams. At present these models are not very predictive, but
more information about SUSY processes, for example from LHC, may give some
constraints on the neutrino masses also in these models.
Another way to explain the small neutrino masses is in universe models with
large extra dimensions, if one allows the right-handed neutrinos to propagate in
the bulk outside our brane, which would make the coupling to the left-handed
neutrinos weak. Assuming that this model is correct, knowledge of the neutrino
masses would give us useful information about the size and physics of the extra
dimensions. This model also allows for neutrinos being Dirac particles, and will
therefore probably get more attention if the neutrinos turn out not to be of Majorana nature. This can be shown for example by a positive detection of neutrino
mass in KATRIN which is not accompanied by a corresponding result from 0νββ
experiments, since the latter process is only allowed in a Majorana scenario.
2.5.3 Conclusions on mass generating mechanisms
At present, it looks like the models for generating neutrino mass are not predictive enough to provide us with more than a few hints in the
search for the absolute neutrino masses. The models contain too many unknown
parameters. Maybe more knowledge of SUSY physics from LHC will constrain
the models more. But it looks like knowledge about neutrino mass will provide
us with knowledge about physics beyond the standard model, and not the other
way around.
So if we are able to pin down the absolute neutrino masses or the mass scheme
from for example cosmological observations, this knowledge could give us some
useful information about the physics beyond the standard model. This means that
doing neutrino cosmology is something very important, which of course feels very
good to know.
Chapter 3
Cosmology
In this chapter I will briefly outline the theoretical background for our cosmological standard model, based on Einstein's field equations. I will also present some of
the most important cosmological observables.
3.1 Notation
I will start by defining some standard notation commonly used when working
with general relativity and cosmology.
When working with Einstein's theory of general relativity (GR) there are indices everywhere. Greek letters will always run over the four values 0, 1, 2, 3,
while Latin letters will run over the three values 1, 2, 3. When for example a vector is indexed, a 0-component will denote a time-component, while the 1-, 2- and
3-components will denote the three spatial directions. An example:
$$ x^\mu = (x^0, x^i) = (ct, \mathbf{x}) \tag{3.1} $$
Mostly I will be using natural units with c = 1 in the analytical derivations. Also
I will stick to Einstein's famous summation convention: Repeated indices imply a
sum over all possible values of the repeated index:
$$ a_\mu b^\mu = \sum_\mu a_\mu b^\mu . \tag{3.2} $$
I will also use the Kronecker delta defined by:
$$ \delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases} \tag{3.3} $$
For the metric I will use the (+ − − −) convention. A comma will denote a partial
derivative (when writing math, not in the text, of course):
$$ A_{,\mu} \equiv \partial_\mu A \equiv \frac{\partial}{\partial x^\mu} A . \tag{3.4} $$
A dot will mean a derivative with respect to cosmic time,
$$ \dot{A} \equiv \frac{d}{dt} A , \tag{3.5} $$
while a prime will denote differentiation with respect to conformal time η (to be defined
later)
$$ A' \equiv \frac{d}{d\eta} A . \tag{3.6} $$
Raising and lowering of indices in 4-vectors is done by the operations
$$ A_\mu = g_{\mu\nu} A^\nu \quad \text{and} \quad A^\mu = g^{\mu\nu} A_\nu , \tag{3.7} $$
where $g_{\mu\nu}$ is the metric (to be further defined in the next section).
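As a small illustration of the summation convention (3.2) and the lowering operation (3.7), the sketch below writes the sums out explicitly. It uses the flat metric of (3.11) with the (+ − − −) signature and c = 1; the components of the example vector are arbitrary numbers chosen for the illustration.

```python
# Flat (Minkowski) metric with the (+ - - -) signature, c = 1.
ETA = [[1, 0, 0, 0],
       [0, -1, 0, 0],
       [0, 0, -1, 0],
       [0, 0, 0, -1]]


def lower(vec_up, metric=ETA):
    """A_mu = g_{mu nu} A^nu, with the sum over the repeated index nu written out."""
    return [sum(metric[mu][nu] * vec_up[nu] for nu in range(4)) for mu in range(4)]


def contract(a_down, b_up):
    """a_mu b^mu, the Einstein sum over mu = 0, 1, 2, 3."""
    return sum(a_down[mu] * b_up[mu] for mu in range(4))


x_up = [5, 1, 2, 3]            # (t, x, y, z), arbitrary example components
x_down = lower(x_up)
print(x_down)                  # spatial components change sign
print(contract(x_down, x_up))  # t^2 - x^2 - y^2 - z^2
```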
A subscript 0 on a cosmological parameter denotes evaluation today, e.g.
$$ t_0 = t_{\mathrm{today}} . \tag{3.8} $$
3.2 Einstein’s field equations
Most modern cosmology is based on Einstein’s field equations. I will simply state
these equations and the names of the quantities appearing in the equations before
briefly explaining the physical interpretation of them. Einstein’s field equations
can be written as:
$$ G_{\mu\nu} = 8\pi G\, T_{\mu\nu} \tag{3.9} $$
Here G is Newton’s gravitational constant. Tµν is the energy-momentum tensor
describing the distribution of energy and pressure in 4-space. The Gµν on the left
hand side is the Einstein tensor, which is a complicated function depending on the
metric and its derivatives.
3.2.1 Gµν and its constituents
The left hand side of (3.9) is a purely geometrical quantity, while the right hand side
is describing some physical content of the spacetime. This is illustrating exactly
what was one of Einstein’s basic ideas: The contents of the spacetime determine
the shape of the spacetime and vice versa.
A metric is simply a thing that translates coordinates in a coordinate system into
physical distances. In a Cartesian coordinate system the differential of the physical
distance squared, $ds^2$, will be given by
$$ ds^2 = dx^2 + dy^2 + dz^2 . $$
This can be written in matrix notation as
$$ ds^2 = (dx, dy, dz) \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} dx \\ dy \\ dz \end{pmatrix} , \tag{3.10} $$
where the matrix in the middle expresses the metric, which is often denoted by
$g_{\mu\nu}$. In general relativity we are working with 4-vectors, and with the given sign convention a flat metric (Minkowski metric) will look like
$$ \eta_{\mu\nu} \equiv g_{\mu\nu}|_{\text{flat space}} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \tag{3.11} $$
and the line element will be
$$ ds^2 = dt^2 - dx^2 - dy^2 - dz^2 = g_{\mu\nu}\, dx^\mu dx^\nu . \tag{3.12} $$
Here I have used units where c = 1. In a curved and/or dynamic spacetime the
metric will of course be less trivial.
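The Einstein tensor from (3.9) is defined as
$$ G_{\mu\nu} \equiv R_{\mu\nu} - \frac{1}{2} g_{\mu\nu} R \tag{3.13} $$
where $R_{\mu\nu}$ is the Ricci tensor, often expressed like (see footnote 1)
$$ R_{\mu\nu} = \Gamma^\alpha_{\ \mu\nu,\alpha} - \Gamma^\alpha_{\ \mu\alpha,\nu} + \Gamma^\alpha_{\ \beta\alpha} \Gamma^\beta_{\ \mu\nu} - \Gamma^\alpha_{\ \beta\nu} \Gamma^\beta_{\ \mu\alpha} . \tag{3.14} $$
R is called the Ricci scalar and is just a contraction of the Ricci tensor:
$$ R \equiv g^{\mu\nu} R_{\mu\nu} \tag{3.15} $$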
(3.15)
We see that to compute the Ricci scalar we need to know all the components of the
Ricci tensor. So you will also need all the components of the Ricci tensor if you
just want to calculate one component of the Einstein tensor.
The Γs are connection coefficients called Christoffel symbols. The reason for
having a tensor equation in the first place is that tensor equations are covariant, i.e.
they do not depend on the choice of coordinate system. If you differentiate a tensor
field the resulting field will in general not transform as a tensor (see footnote 2), and you will lose
your beautiful covariance. To avoid this problem Elwin Christoffel introduced a
covariant derivative, denoted $A^\mu_{\ ;\nu}$, defined to be a differentiation that conserves
the tensorial nature of the differentiated field. With this definition the Christoffel
symbols are given by
$$ A^\mu_{\ ;\nu} \equiv A^\mu_{\ ,\nu} + A^\alpha\, \Gamma^\mu_{\ \alpha\nu} \tag{3.16} $$
where the $\Gamma^\mu_{\ \alpha\nu}$ are chosen so that they conserve the tensorial nature of $A^\mu_{\ ;\nu}$. In a coordinate basis (see footnote 3), which we will be using all the time, the Christoffel symbols can be
computed through the much more straightforward expression
$$ \Gamma^\mu_{\ \alpha\beta} = \frac{1}{2} g^{\mu\nu} \left[ g_{\alpha\nu,\beta} + g_{\beta\nu,\alpha} - g_{\alpha\beta,\nu} \right] . \tag{3.17} $$
Notice that the Christoffel symbols are symmetric in the lower indices, which helps to simplify many calculations in GR.

Footnote 1: It may also be defined with different contractions.
Footnote 2: A tensor transforms like $A^{\mu'} = \frac{\partial x^{\mu'}}{\partial x^\mu} A^\mu$.
Now, if we have a metric, we can compute the Christoffel symbols, then the
components of the Ricci tensor, contract it to get the Ricci scalar, and then we
can find our Einstein tensor. The only problem is that the metric depends on the
contents of your spacetime, and that brings us over to the right hand side of (3.9).
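The first step of this recipe, going from a metric to the Christoffel symbols via (3.17), can be sketched numerically. As a test case I use the metric of the unit 2-sphere, $ds^2 = d\theta^2 + \sin^2\theta\, d\phi^2$, which is an illustrative choice of mine and not a metric used in this thesis; the derivatives are approximated by central differences, and all function names are hypothetical.

```python
import math


def metric(coords):
    """Unit 2-sphere metric in coordinates (theta, phi); illustrative example."""
    theta, _phi = coords
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]


def inverse_2x2(g):
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]


def d_metric(coords, k, h=1e-5):
    """Central-difference estimate of g_{ab,k}, the derivative along coordinate k."""
    up = list(coords)
    dn = list(coords)
    up[k] += h
    dn[k] -= h
    g_up, g_dn = metric(up), metric(dn)
    return [[(g_up[a][b] - g_dn[a][b]) / (2 * h) for b in range(2)] for a in range(2)]


def christoffel(coords):
    """gamma[m][a][b] = 1/2 g^{mn} (g_{an,b} + g_{bn,a} - g_{ab,n}), as in (3.17)."""
    n = 2
    g_inv = inverse_2x2(metric(coords))
    dg = [d_metric(coords, k) for k in range(n)]   # dg[k][a][b] = g_{ab,k}
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for m in range(n):
        for a in range(n):
            for b in range(n):
                gamma[m][a][b] = 0.5 * sum(
                    g_inv[m][nu] * (dg[b][a][nu] + dg[a][b][nu] - dg[nu][a][b])
                    for nu in range(n))
    return gamma


theta0 = 1.0
gamma = christoffel((theta0, 0.3))
print("Gamma^theta_phi,phi :", gamma[0][1][1], "analytic:", -math.sin(theta0) * math.cos(theta0))
print("Gamma^phi_theta,phi :", gamma[1][0][1], "analytic:", math.cos(theta0) / math.sin(theta0))
```

The same function structure would carry over to a four-dimensional spacetime metric by changing the dimension and replacing the 2 × 2 inverse with a general matrix inverse.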
3.2.2 The energy-momentum tensor Tµν
As already mentioned, while the left hand side of (3.9) describes the curvature
of spacetime, the right hand side contains the mass/energy that is filling up the
spacetime. This stuff/energy is conveniently described by the symmetric energy-momentum tensor
$$T_{\mu\nu} = \begin{pmatrix} T_{00} & T_{01} & T_{02} & T_{03} \\ T_{10} & T_{11} & T_{12} & T_{13} \\ T_{20} & T_{21} & T_{22} & T_{23} \\ T_{30} & T_{31} & T_{32} & T_{33} \end{pmatrix} \tag{3.18}$$
where the different components have the physical interpretations
$T_{00}$: energy density
$T_{i0}$: momentum density
$T_{0i}$: energy flux
$T_{ii}$: pressure
$T_{ij}$: shear forces or viscosity
Using this energy-momentum tensor, the common physical assumptions of energy
and momentum conservation can be expressed as $T^{\mu\nu}_{\ \ ;\nu} = 0$. In cosmology one
often assumes that the contents of the universe can be described as perfect fluids.
To justify such a fluid description we have to assume that
• the temperature and entropy of the fluid are uniquely defined.
• no shear forces (viscosity) are present, since such forces will produce heat in
the presence of currents.
• the particles in the fluid are frequently interacting to maintain hydrodynamical equilibrium. This is not always the case for cosmological fluids. For
example, cosmic neutrinos do not interact much today. Despite this,
the fluid approximation can often be used successfully if the particles have
formerly been in hydrodynamical equilibrium, provided that the phase space
distribution of the particles is not much altered after decoupling.
3 In coordinate basis the basis vectors are defined by $e_\mu = \frac{\partial r}{\partial x^\mu}$, where r is a curve in our spacetime.
Using a fluid description, the continuity equation can be expressed as $T^{0\nu}_{\ \ ;\nu} = 0$.
When one also assumes that the fluid is perfect (that is, no viscosity), one is only
left with the diagonal components of $T_{\mu\nu}$. The energy-momentum tensor for a
perfect fluid can be written as
$$T_{\mu\nu} = (\rho + P)u_\mu u_\nu - P g_{\mu\nu} \tag{3.19}$$
where ρ is the energy density and P is the pressure of the perfect fluid. $u_\mu$ represents the 4-velocity of the fluid ($u^\mu = x^\mu_{\ ;0}$). But as we are free to choose a comoving
basis, the components of $u_\mu$ can be reduced to $u_\mu = (c, 0, 0, 0) = (1, 0, 0, 0)$. Then
the energy-momentum tensor is given by
$$T_{\mu\nu} = \begin{pmatrix} \rho & 0 & 0 & 0 \\ 0 & P & 0 & 0 \\ 0 & 0 & P & 0 \\ 0 & 0 & 0 & P \end{pmatrix} \tag{3.20}$$
which looks quite nice and simple. Given a fluid with known density, we now only
need to find a relation between ρ and P for that fluid, and the right hand side of
the Einstein equations is determined. This relation between ρ and P is commonly
expressed as an equation of state
P = wρ
(3.21)
where w is called the equation of state parameter. To determine the equation of
state it is common to consider three different types of cosmic fluids:
• Dust or non-relativistic matter, which has no pressure and thus wdust = 0.
• Radiation or ultra-relativistic matter, for which we require a traceless energy-momentum tensor, $T^\mu_{\ \mu} = 0$, and thus $w_{\rm radiation} = \frac{1}{3}$.
• Vacuum energy. It is common to assume the physical properties of vacuum
to be Lorentz-invariant, that is, it is not possible to measure any velocity
relative to vacuum. From this it follows that $T_{\mu\nu} \propto g_{\mu\nu}$, which means that
$w_X = -1$.$^4$
4 Formally this Lorentz invariant vacuum energy is the same as Einstein’s famous cosmological
constant, which he introduced as a term $\Lambda g_{\mu\nu}$ in his equations to allow for static universe models.
Lorentz invariant vacuum energy with $w_X = -1$ is still commonly referred to as a cosmological
constant.
3.3 The Friedmann equations
The Einstein equations consist of ten coupled differential equations. One cannot
just solve them for the universe, even with a perfect knowledge of initial conditions.
To get simple analytical results, drastic simplifications have to be made.
The Friedmann equations are a set of simple and beautiful differential equations that, despite their simplicity, have proven to give a very good zeroth-order
description of the evolution of the universe. The basic assumptions behind the
Friedmann equations are the following:
• Homogeneity. For comoving observers the universe looks the same everywhere in space when observed at the same comoving time.
• Isotropy. For a comoving observer the universe looks the same in every
direction.
The above assumptions are connected in the way that isotropy in every spatial point
implies homogeneity. Evidently, the assumptions are wrong on the scales that we
consider in everyday life, and both the above assumptions can be falsified just by
the existence of non-trivial structures like coffee machines and galaxies. Yet, the
cosmological scales are a lot larger than both coffee machines and galaxies, and
spatial homogeneity and isotropy have turned out to be a good zeroth-order approximation of the universe on large scales. Observations indicate that the universe is
spatially flat, and I will from now on only consider flat universe models.
Note that we have not made any assumptions on the temporal behavior of the
universe, such that we still allow for the universe to evolve in time as long as it
evolves in the same way everywhere. Our assumptions imply that the whole
geometry of the universe can be described by the metric
$$g_{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -a^2(t) & 0 & 0 \\ 0 & 0 & -a^2(t) & 0 \\ 0 & 0 & 0 & -a^2(t) \end{pmatrix}. \tag{3.22}$$
This is called the Friedmann-Robertson-Walker (FRW) metric. The factor a(t) is
called the scale factor and is simply telling us how the spatial distance between
two comoving observers in our universe is evolving with time. It is common to set
$a(t_0) = 1$ today. This means that when a(t) was equal to 0.1 all distances were a
tenth of their values today, and a(0) = 0 corresponds to a Big Bang at t = 0.
There is a simple relation between the redshift of light, $z \equiv \frac{\lambda_0 - \lambda_{\rm emitted}}{\lambda_{\rm emitted}}$, due to
expansion, and the scale factor at the time of emission, given by
$$z + 1 = a^{-1}. \tag{3.23}$$
Inserting the FRW-metric into the Einstein equations (3.9) will give us the
Friedmann equations for flat space:
$$H^2(t) = \frac{\dot a^2(t)}{a^2(t)} = \frac{8\pi G}{3}\rho \tag{3.24}$$
$$\frac{\ddot a}{a} = -\frac{4\pi G}{3}(\rho + 3P) \tag{3.25}$$
Here H(t) is the Hubble parameter defined as $H(t) = \frac{\dot a}{a}$. Inserting for different
equations of state and solving these equations can be a lot of fun, but I will not
use a lot of spacetime doing that here. But one important property is going to
be emphasized. In (3.25) we see that a negative right hand side will correspond
to a universe that undergoes deceleration, while a positive sign corresponds to an
accelerating universe. From the right hand side one then easily sees that
• w<-1/3 gives an accelerated expansion of the universe.
• w>-1/3 will make the expansion of the universe slow down.
Energy components with w < −1/3 are in general referred to as dark energy. I
will denote the dark energy equation of state parameter as wX .
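The two cases can be read off in a single step. As a short worked example, assuming a single fluid with ρ > 0 that obeys the equation of state (3.21), inserting P = wρ into (3.25) gives:

```latex
\frac{\ddot a}{a} = -\frac{4\pi G}{3}\,(\rho + 3P)
                  = -\frac{4\pi G}{3}\,(1 + 3w)\,\rho ,
\qquad\text{so}\qquad
\ddot a > 0 \;\Longleftrightarrow\; w < -\tfrac{1}{3}.
```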
The ρ in (3.24) is usually written as a sum of different components, for example
radiation, dust and vacuum energy. Of course these densities are not in general
constant as the universe evolves, and must be given a time dependence. I will not
discuss the detailed derivations here, but if we tag the densities and time today with
a subscript 0 and set a0 = a(t0 ) = 1 we will find that
• $\rho_{\rm dust}(t) = \rho_m(t) = \rho_{m0}\,a^{-3}(t)$. This seems logical. The energy density will
just scale as the number density of “dust particles”.
• $\rho_{\rm radiation} = \rho_r = \rho_{r0}\,a^{-4}(t)$. This is also logical. The number density of
photons scales as $a^{-3}$, but at the same time the energy of each photon scales
like $a^{-1}$ as the wavelengths are enlarged (redshifted) as they follow the expansion of the rest of the universe.
• $\rho_{\rm vacuum} = \rho_\Lambda = \rho_{\Lambda 0}$. If one considers the vacuum energy as just a property
of the vacuum itself, it seems logical that the vacuum energy density stays
constant.
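The three scalings are special cases of one relation. As a sketch of the skipped derivation, assume a single perfect fluid with constant w; the continuity equation in the FRW metric then takes its standard form and can be integrated directly:

```latex
\dot\rho + 3\frac{\dot a}{a}(\rho + P) = 0
\;\;\Rightarrow\;\;
\frac{d\rho}{\rho} = -3(1+w)\frac{da}{a}
\;\;\Rightarrow\;\;
\rho(a) = \rho_0\, a^{-3(1+w)} .
```

Setting w = 0, w = 1/3 and w = −1 reproduces the $a^{-3}$, $a^{-4}$ and constant behavior listed above.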
From these scaling properties it is clear that an expanding, flat universe consisting
of radiation, non-relativistic matter and a non-zero Lorentz-invariant vacuum energy will undergo first an epoch of radiation domination, perhaps an intermediate
epoch of matter domination (if the matter density is sufficiently large) and finally
end up as more and more vacuum dominated and expand faster and faster forever.
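The sequence of epochs can be made concrete with a few lines of code. This is a rough illustration only: the density parameters below are assumed round numbers chosen for the example (they are not taken from the text), and the function names are hypothetical.

```python
import math

# Illustrative present-day density parameters; these numbers are assumptions
# chosen for the example and are not taken from the text.
OMEGA_R0 = 8.4e-5
OMEGA_M0 = 0.3
OMEGA_L0 = 1.0 - OMEGA_R0 - OMEGA_M0  # flat universe


def densities(a):
    """Energy densities at scale factor a, in units of the critical density today."""
    return {
        "radiation": OMEGA_R0 * a ** -4,
        "matter": OMEGA_M0 * a ** -3,
        "vacuum": OMEGA_L0,
    }


def hubble_ratio(a):
    """H(a) / H0 from the first Friedmann equation (3.24)."""
    return math.sqrt(sum(densities(a).values()))


def dominant_component(a):
    """Name of the component with the largest energy density at scale factor a."""
    rho = densities(a)
    return max(rho, key=rho.get)


a_rm = OMEGA_R0 / OMEGA_M0                   # radiation-matter equality
a_mv = (OMEGA_M0 / OMEGA_L0) ** (1.0 / 3.0)  # matter-vacuum equality
for label, a in (("radiation-matter", a_rm), ("matter-vacuum", a_mv)):
    print(label, "equality at a =", a, "z =", 1.0 / a - 1.0)
```

With these assumed numbers, `dominant_component` returns radiation at very small a, matter at intermediate a and vacuum at a = 1, which is the ordering described above.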
Using the Friedmann equations it is also straightforward to show that the time
evolution of the scale factor can be written as
$$a(t) \propto \begin{cases} t^{\frac{2}{3(1+w)}} & \text{if } w \neq -1 \\ e^{\sqrt{C}\,t} & \text{if } w = -1 \end{cases} \tag{3.26}$$
where C is an integration constant. In the simplified picture where we assume
that the universe is dominated by one energy component at a time, we then have:
• $a_m \propto t^{2/3}$
• $a_r \propto t^{1/2}$
• $a_\Lambda \propto e^{\sqrt{C}\,t}$
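These power laws follow from (3.24) in a few steps. The sketch below assumes a flat universe dominated by a single fluid whose density scales as $\rho = \rho_0 a^{-3(1+w)}$, and an expanding solution with a(0) = 0:

```latex
\frac{\dot a}{a} = \sqrt{\frac{8\pi G \rho_0}{3}}\; a^{-\frac{3}{2}(1+w)}
\;\;\Rightarrow\;\;
a^{\frac{3}{2}(1+w) - 1}\, da \propto dt
\;\;\Rightarrow\;\;
a(t) \propto t^{\frac{2}{3(1+w)}} \qquad (w \neq -1).
```

For w = −1 the right hand side of the first relation is constant, so $\dot a / a$ is constant and the scale factor grows exponentially; under these assumptions $\sqrt{C}$ can be identified with $\sqrt{8\pi G \rho_\Lambda / 3}$.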
Inspired by the form of the first Friedmann equation (3.24) it is common to
define the critical density today by
$$\rho_{cr0} = \frac{3H_0^2}{8\pi G}. \tag{3.27}$$
It is also common to define another parameter, the fractional energy density, Ω, by
$$\Omega = \sum_i \Omega_i = \frac{\rho}{\rho_{cr0}} = \frac{\sum_i \rho_i}{\rho_{cr0}} \tag{3.28}$$
where i runs over all energy components. Using this we can rewrite equation
(3.24) as
$$\sum_i \Omega_{i0} = \Omega_{m0} + \Omega_{r0} + \Omega_{\Lambda 0} = 1. \tag{3.29}$$
The 0-subscript on the Ωs is often omitted, and evaluation today is made implicit.
The Hubble parameter today, $H_0$, is also commonly written as
$$H_0 = 100h\ \mathrm{km\,s^{-1}\,Mpc^{-1}} \tag{3.30}$$
where the dimensionless Hubble parameter h ≈ 0.7 (see e.g. [39]). Using
this definition of h it is also common to use yet another density parameter, ω,
defined by
$$\omega_i \equiv \Omega_i h^2. \tag{3.31}$$
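To get a feeling for the numbers, the definitions (3.27), (3.30) and (3.31) can be evaluated directly. The snippet below is a small sketch in SI units; the constants are standard reference values and the function names are hypothetical, introduced only for this example.

```python
import math

G_NEWTON = 6.67430e-11                   # m^3 kg^-1 s^-2 (CODATA 2018)
METRES_PER_MPC = 3.0856775814913673e22   # metres in one megaparsec


def hubble_constant_si(h):
    """H0 = 100 h km/s/Mpc, eq. (3.30), converted to s^-1."""
    return 100.0 * h * 1.0e3 / METRES_PER_MPC


def critical_density(h):
    """rho_cr0 = 3 H0^2 / (8 pi G), eq. (3.27), in kg/m^3."""
    return 3.0 * hubble_constant_si(h) ** 2 / (8.0 * math.pi * G_NEWTON)


def physical_density(omega_fraction, h):
    """omega_i = Omega_i h^2, eq. (3.31)."""
    return omega_fraction * h ** 2


print(critical_density(0.7))        # roughly 9.2e-27 kg/m^3
print(physical_density(0.3, 0.7))   # 0.147 up to rounding
```

A critical density of order 10^-26 kg/m^3 corresponds to only a few proton masses per cubic metre.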
3.4 The first 300 000 years or so
I will not give a detailed description of the history of the early universe here. The
important thing for neutrino cosmology is that we had a big bang which produced
some Gaussian initial fluctuations and a lot of particles, of which a certain fraction were
neutrinos. I will briefly comment on the inflation model, the neutrino abundance
and the formation of the cosmic microwave background radiation (CMB).
3.4.1 Inflation
According to the inflation model, the universe, at the age of a small fraction of a
second, went through a short epoch of rapid inflation (or exponential growth) where
its size increased by a factor $> e^{55}$. The main reason for introducing an inflationary
epoch is that it resolves two major problems with the big bang model:
• The flatness problem. Why does the universe appear to be spatially flat?
In an inflationary model initial curvature will be almost erased, making the
universe look almost flat today.
• The isotropy problem. Why does the universe appear to be so isotropic? In
a big bang model without inflation objects that we observe on opposite sides
of the sky should never have been in causal contact. Why does the universe
appear to be isotropic then? Within an inflation model all of the observable universe could have been in thermal equilibrium before the inflationary
epoch. Then causally connected scales were blown outside the Hubble horizon, and now they are re-entering the horizon (I will comment further on the
Hubble horizon in section 4.8).
The most common inflation models involve one or more scalar fields (called inflaton fields) that are rolling down a potential. One of the predictions from a standard single field inflation is that the primordial power spectrum (to be defined later)
should be of the form
$$P(k) \propto k^{n_s - 1} \tag{3.32}$$
where $n_s$, which is called the scalar spectral index, is predicted to be a bit smaller
than 1. That is, the primordial power spectrum should be almost scale invariant.
The WMAP team has measured ns ≈ 0.95 [39], strengthening the simple single
field inflation model. A rather thorough discussion of the relations between single
field inflation and ns can be found in [40].
A problem with the inflation model is that we do not know what this inflaton
field is. Another problem is that one can produce almost any kind of primordial
spectrum when adding more scalar fields to the inflation model. This makes the
general idea of inflation hard to test.
3.4.2 Neutrinos in the early universe
We now go back to the universe when the temperature has decreased sufficiently for
the quarks to form nucleons but still at a temperature above 1 MeV. The universe
is now so dense and hot that the neutrinos still are in equilibrium with the baryon-photon plasma$^5$, following a Fermi-Dirac distribution
$$f_\nu = \frac{1}{e^{(p-\mu)/T} + 1}. \tag{3.33}$$
Here the Boltzmann constant, $k_B$, is set to 1. At this temperature the neutrinos are
still ultra-relativistic, so we can safely use the momentum p instead of the energy
in the distribution function. The chemical potential for the neutrinos is assumed
to be negligible [41]. This assumption can be verified because $\mu_{\nu_e}$ affects the n-p
conversion rate in the early universe through reactions like $p + e^- \to n + \nu_e$. The
p to n ratio affects the production of light elements in Big Bang nucleosynthesis
(BBN) and the abundance of light elements today, which can be observed. These
observations indicate a very small $\mu_{\nu_e}$. Because of neutrino oscillations between
the different flavors in the early universe, this low value of the chemical potential
also applies to $\nu_\mu$ and $\nu_\tau$ [42].
5 In cosmology the term “baryon” also includes the charged leptons.
From this distribution function the number density of neutrinos is given by
integration over momentum space. At a temperature Tν = Tγ ≈ 1MeV the interaction rate for the neutrinos falls below the expansion rate of the universe and
the neutrinos decouple from the rest of the plasma. This is not an instantaneous
process, nor did it happen at the same time for all neutrino species, but the approximation of an instantaneous process turns out to be quite good. Shortly after this
the universe is cold enough for electrons and positrons to annihilate. This leads
to a reheating of the baryon-photon plasma that the neutrinos do not take part in.
Simple considerations from counting degrees of freedom in statistical mechanics
lead to a difference in neutrino and photon temperature given by
$$T_\nu = \left(\frac{4}{11}\right)^{1/3} T_\gamma. \tag{3.34}$$
More accurate numerical treatments of the processes around the neutrino decoupling have been done. It turns out that some of the neutrinos gain some extra energy
from the electron-positron annihilation. This can be accounted for just by using an
effective number of neutrinos $N_{\rm eff} = 3.04$ instead of 3 in the cosmological Boltzmann
codes.
Integration over the distribution function yields an average number density of
$n_\nu \approx 113\,\mathrm{cm^{-3}}$ per neutrino flavor, which is of the same order as the photon density. Using the knowledge
of the neutrino number density and the fact that they are non-relativistic today, it
is straightforward to show that the neutrino energy density in the universe today is
given by
$$\omega_\nu = \Omega_\nu h^2 = \frac{M_\nu}{93.14\,\mathrm{eV}} \tag{3.35}$$
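The numbers quoted in this section can be reproduced approximately with a short script. This is a sketch under stated assumptions: the present photon temperature of 2.725 K is an assumed input that does not appear in the text, $M_\nu$ is taken here to be the sum of the neutrino masses, the relation $n_\nu = \frac{3}{11} n_\gamma$ per flavor is the standard result for instantaneous decoupling, and all function names are hypothetical.

```python
import math

K_B = 1.380649e-23        # J/K
HBAR = 1.054571817e-34    # J s
C_LIGHT = 2.99792458e8    # m/s
ZETA_3 = 1.2020569031595943
T_GAMMA_0 = 2.725         # K, present CMB temperature (assumed value, not from the text)


def neutrino_temperature(t_gamma=T_GAMMA_0):
    """T_nu = (4/11)^(1/3) T_gamma, eq. (3.34)."""
    return (4.0 / 11.0) ** (1.0 / 3.0) * t_gamma


def photon_number_density(t_gamma=T_GAMMA_0):
    """Blackbody photon density 2 zeta(3) / pi^2 * (k_B T / (hbar c))^3, in cm^-3."""
    inverse_length = K_B * t_gamma / (HBAR * C_LIGHT)   # m^-1
    return 2.0 * ZETA_3 / math.pi ** 2 * inverse_length ** 3 * 1.0e-6


def neutrino_number_density(t_gamma=T_GAMMA_0):
    """Neutrinos plus antineutrinos of one flavor: n_nu = (3/11) n_gamma, in cm^-3."""
    return 3.0 / 11.0 * photon_number_density(t_gamma)


def omega_nu(mass_sum_ev, h):
    """Omega_nu from eq. (3.35), with the mass sum M_nu given in eV."""
    return mass_sum_ev / 93.14 / h ** 2


print(neutrino_temperature())       # about 1.95 K
print(neutrino_number_density())    # about 112 cm^-3 per flavor
print(omega_nu(1.0, 0.7))           # about 0.022 for an assumed M_nu = 1 eV
```

The small difference between the computed density and the value quoted above comes from the assumed photon temperature.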
3.4.3 Formation of CMB
After electron-positron annihilation the photon-baryon plasma was still a tightly
coupled fluid through the frequent interactions between free electrons, nucleons
and photons. At a temperature of Tγ ∼ 3kK it was cold enough for the electrons
to combine with the nucleons to form stable atoms. At this point the photons
stopped interacting with the baryonic plasma, and they started propagating almost
undisturbed through the universe, only being redshifted by the expansion of the
universe. These are the photons that we observe as the CMB radiation today. In
the CMB radiation we observe small temperature fluctuations of order $\frac{\delta T}{T} \sim 10^{-5}$.
These fluctuations were the seeds that later grew and formed the structures that
we see in the universe today. The structure of the CMB temperature fluctuations
is maybe the single most important cosmological probe that we have today. More
details on how to quantify and analyze these fluctuations will be given later.
3.5 Cosmological observables
In the following discussions about perturbation theory and the effect of massive
neutrinos on cosmology, I will mainly focus on the effects on cosmic microwave
background radiation (CMB) and large scale structures (LSS). Here I will briefly
explain the observed quantities in CMB and LSS experiments. I will also briefly
explain some other cosmological observables which I will use in my later analysis.
3.5.1 CMB measurements
The origin of CMB was explained in section 3.4. The main observable quantity
in the CMB radiation is the temperature fluctuations and their angular distribution.
There are also measurements of polarization effects, which are of great importance
for example when it comes to ruling out inflation models, but here I will only comment upon the temperature fluctuations. At some point we want to compare these
temperature fluctuations to predictions from some theory that we want to test. Of
course no theory can predict the exact distribution of the temperature fluctuations
over the sky, but only statistical properties of the distribution. So what we need
is measurements of how large the fluctuations are on average on different angular
scales. I will now outline how these fluctuations are parametrized, following the
procedure in [40].
First we define the temperature fluctuation at a position x as
$$\Theta(x, n, \eta) = \frac{1}{T}\,\delta T(x, n, \eta). \tag{3.36}$$
Here n is a unit vector in the direction of the momentum of the photon. Since we
are looking for the distribution of fluctuations on different scales, it is convenient
to transform Θ(x, n, η) into Fourier space. We then have
$$\Theta(x, n, \eta) = \int \frac{d^3k}{(2\pi)^3}\, \Theta(k, n, \eta)\, e^{ik\cdot x}. \tag{3.37}$$
We now introduce a new quantity µ defined by µ ≡ k̂ · n = cos θ, where k̂ is a unit
vector in the direction of k. Using this we can write the solid angle element as
dΩ = 2π sin θdθ = 2πd cos θ = 2πdµ.
(3.38)
We now write Θ out as a sum of Legendre polynomials
$$\Theta(k, \mu, \eta) = \sum_{l=0}^{\infty} (-i)^l (2l + 1)\,\Theta_l(k, \eta)\, P_l(\mu) \tag{3.39}$$
where $P_l$ is a Legendre polynomial of order l. Using (3.38) and the orthogonality
relation for Legendre polynomials
$$\int_{-1}^{1} d\mu\, P_l(\mu) P_{l'}(\mu) = \frac{2\delta_{ll'}}{2l + 1} \tag{3.40}$$
we can write
$$\frac{1}{4\pi}\int d\Omega\, P_l P_{l'} = \frac{\delta_{ll'}}{2l + 1}. \tag{3.41}$$
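The orthogonality relation (3.40) is easy to check numerically. The sketch below builds $P_l$ from the standard three-term (Bonnet) recursion and integrates the product with a composite Simpson rule; the function names and the number of integration steps are arbitrary choices made for this example.

```python
def legendre(l, mu):
    """P_l(mu) from the recursion (n + 1) P_{n+1} = (2n + 1) mu P_n - n P_{n-1}."""
    p_prev, p = 1.0, mu
    if l == 0:
        return p_prev
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * mu * p - n * p_prev) / (n + 1)
    return p


def overlap(l1, l2, steps=2000):
    """Composite Simpson estimate of the integral of P_l1 * P_l2 over [-1, 1]."""
    h = 2.0 / steps  # steps must be even for Simpson's rule
    total = 0.0
    for i in range(steps + 1):
        mu = -1.0 + i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * legendre(l1, mu) * legendre(l2, mu)
    return total * h / 3.0


print(overlap(2, 2), 2.0 / (2 * 2 + 1))  # both close to 0.4
print(overlap(2, 3))                     # close to 0
```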
Using this orthogonality we find that (3.39) is satisfied if we write a single multipole of Θ as
$$\Theta_l(k, \eta) = \frac{i^l}{4\pi}\int d\Omega\, \Theta(k, n, \eta)\, P_l(\mu). \tag{3.42}$$
An important goal for CMB measurements is to find correlations between points
with different angular separations on the sky. If two points are separated by an
angle β on the sky, it is common to define a two-point correlation function by
$$C(\beta) = \langle \Theta(x, n, \eta_0)\,\Theta(x, n', \eta_0) \rangle. \tag{3.43}$$
Here $\cos\beta = n \cdot n'$. We want to rewrite this correlation function in terms of
Legendre polynomials in the form
$$C(\beta) = \frac{1}{4\pi}\sum_{l=0}^{\infty} (2l + 1)\, C_l\, P_l(\cos\beta). \tag{3.44}$$
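We now need to find an expression for this $C_l$. If we insert the expression for Θ
from (3.37) in (3.43) we have
$$\begin{aligned} C(\beta) &= \int\frac{d^3k}{(2\pi)^3}\int\frac{d^3k'}{(2\pi)^3}\,\langle\Theta(k, n, \eta_0)\,\Theta(k', n', \eta_0)\rangle\, e^{i(k+k')\cdot x} \\ &= \int\frac{d^3k}{(2\pi)^3}\int\frac{d^3k'}{(2\pi)^3}\sum_{l,l'=0}^{\infty}(-i)^{l+l'}(2l+1)(2l'+1)\,\langle\Theta_l(k, \eta_0)\,\Theta_{l'}(k', \eta_0)\rangle\, P_l(\hat{k}\cdot n)\,P_{l'}(\hat{k}'\cdot n')\, e^{i(k+k')\cdot x}. \end{aligned} \tag{3.45}$$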
In the last equality I have used the relation from (3.39). We now make the assumption that our primordial temperature fluctuations are Gaussian. This is predicted by
standard inflation models, and it also seems to fit the CMB data rather well. With
this assumption of Gaussianity the different $\Theta_l$ modes are orthogonal, and we have
$$\langle\Theta_l(k, \eta_0)\,\Theta_{l'}(k', \eta_0)\rangle = (2\pi)^3\,\delta(k + k')\,\delta_{ll'}\,\langle|\Theta_l(k, \eta_0)|^2\rangle. \tag{3.46}$$
From the Dirac delta function δ(k + k′ ) we see that the integration in (3.45) only
gives contributions when k′ = −k. The summation over the ls will, due to the
Kronecker delta δll′ only give contributions for l = l′ . Using these properties we
can now rewrite the expression from (3.45) in the simplified way
$$C(\beta) = \sum_{l=0}^{\infty}\int\frac{d^3k}{(2\pi)^3}\,(2l + 1)^2\,\langle|\Theta_l(k, \eta_0)|^2\rangle\, P_l(\hat{k}\cdot n)\,P_l(\hat{k}\cdot n'). \tag{3.47}$$
The product of the Legendre polynomials can now be simplified by using the orthogonality relation from (3.41) and that $\int d^3k = \int dk\,k^2 \int d\Omega_k$. We have
$$\int d\Omega_k\, P_l(\hat{k}\cdot n)\,P_l(\hat{k}\cdot n') = \frac{4\pi}{2l + 1}\,P_l(n\cdot n'). \tag{3.48}$$
Using this, the expression for C(β) now reduces to
$$C(\beta) = \sum_{l=0}^{\infty}\int\frac{d^3k}{(2\pi)^3}\,(2l + 1)\,\langle|\Theta_l(k, \eta_0)|^2\rangle\, P_l(\cos\beta). \tag{3.49}$$
The point of doing all this was to find an expression for the $C_l$ term in (3.44).
Comparing (3.44) and (3.49) we see that $C_l$ can be written as
$$C_l = 4\pi\int\frac{d^3k}{(2\pi)^3}\,\langle|\Theta_l(k, \eta_0)|^2\rangle. \tag{3.50}$$
We see that $C_l$ is always a positive quantity, and that a large $C_l$ implies a large
temperature fluctuation on the scale set by l. Measurements of the temperature
fluctuations of the CMB are usually given in terms of $l(l + 1)C_l$ as a function of l.
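The relation (3.44) between the $C_l$ and the angular correlation function can be illustrated with a small script. It is a toy sketch only: the input spectra are made-up lists truncated at a finite l, not measured data, and the function names are hypothetical.

```python
import math


def legendre(l, x):
    """P_l(x) from the recursion (n + 1) P_{n+1} = (2n + 1) x P_n - n P_{n-1}."""
    p_prev, p = 1.0, x
    if l == 0:
        return p_prev
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p


def correlation(c_l, beta):
    """C(beta) from eq. (3.44), with the sum truncated at l = len(c_l) - 1."""
    x = math.cos(beta)
    total = sum((2 * l + 1) * c * legendre(l, x) for l, c in enumerate(c_l))
    return total / (4.0 * math.pi)


# Toy spectra: a pure monopole gives a constant correlation,
# a pure dipole gives a correlation proportional to cos(beta).
monopole_only = [4.0 * math.pi]
dipole_only = [0.0, 4.0 * math.pi / 3.0]
print(correlation(monopole_only, 0.3))
print(correlation(dipole_only, 1.0), math.cos(1.0))
```

Power at low l thus corresponds to correlations that vary slowly with the separation angle β, while power at high l encodes structure on small angular scales.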
Precise measurements of the CMB spectrum are at present the single most constraining set of cosmological data that we have. The best full-sky data set
comes from the three-year data from the WMAP team [39]. Within a few
years the first data from the Planck satellite will be released, adding even more
precision to the full-sky CMB spectrum. There are also other CMB experiments
measuring small scale fluctuations, such as ACBAR [43], CBI [44], VSA [45] and
BOOMERanG [46].
3.5.2 Large scale structure surveys
Measurements of the large scale structures (LSS) of the universe are maybe the
second most important kind of data that we have to constrain our cosmological
models$^6$. Roughly speaking, what LSS surveys do is to find as many galaxies
as possible and record their angular and redshift distribution. Then one can make
statistics on how much clustering we have on different scales and compare that to
our cosmological models.
The common notation for quantifying LSS differs slightly from that of the CMB,
since we here are measuring galaxy distributions in both angular and redshift space.
It is common to define a variable $\delta_m(k, z) \equiv \frac{\delta\rho_m(k, z)}{\rho_m(z)}$ which corresponds to the Θ
variable used for temperature fluctuations. Here $\rho_m$ is the average matter density,
while $\delta\rho_m$ denotes the difference from $\rho_m$ in the local matter density. Again we
are working in Fourier space. The matter power spectrum is then defined as the
Fourier transform of the two-point correlation function
$$P(k, z) = \langle|\delta_m(k, z)|^2\rangle. \tag{3.51}$$
6 Some people might argue that supernova surveys are of equal importance. But at least from a
neutrino cosmologist’s point of view, measurements of LSS are extremely important.
By counting galaxies in cells in k-z space, this correlation function is directly
measurable. Or, that is, what we really are measuring is the galaxy-galaxy power
spectrum $\langle|\delta_g(k, z)|^2\rangle$. But most of the matter in our standard ΛCDM universe
model is in the form of invisible dark matter. So we are making theoretical predictions from linear theories on quantities that are dominated by dark matter, and we
have to compare our results to observations of the part of the baryonic matter that
is stuck in galaxies (which are highly non-linear). It is common to define a bias
parameter b by $P_g(k) = b^2 P_m(k)$ relating the observed galaxy spectrum and the
total matter power spectrum. Simulations indicate that b should be a constant on
the linear scales that we are usually working on in cosmology, and b can in this
case just be treated as a free normalization parameter.
There are two large sets of galaxy counts today, the 2dF Galaxy Redshift Survey (2dF) and the Sloan Digital Sky Survey (SDSS) (which is the largest one). In
my later analysis I will use data from both these surveys.
3.5.3 Some other cosmological observables
Some other frequently used observables relevant for neutrino cosmology are:
• Supernovae type 1a (Sn1a). This is a frequently used “standard candle” in
cosmology. The time-luminosity function of this type of supernova is believed to be well understood. Therefore, observing such supernovae, one
can correlate the known intrinsic luminosity, the observed luminosity and
the observed redshift and extract important information on the late-time expansion history of the universe. The largest data sets on Sn1a are the Riess
“gold sample” (which I will use), and the Supernova Legacy Survey.
• Hubble Space Telescope (HST) key project. Measurements from HST put
constraints on the Hubble parameter h, which I will use in my later analysis.
• The Lyman alpha forest (Ly-α). Studying emitted light from quasars one
finds Ly-α (1216 Å) absorption lines from intervening gas clouds. The
strength of these lines can be used to estimate the amount of hydrogen in
gas clouds between the quasars and ourselves [4]. The current best dataset
on Ly-α comes from SDSS with about 3000 quasar spectra [47]. Ly-α data
are potentially very useful for neutrino cosmology since they can measure small-scale fluctuations at relatively high redshifts (z ∼ 2 − 4), where effects of
non-linearities in the matter power spectrum enter at smaller scales than for
z ≈ 0. However, there are still some uncertainties concerning the systematic
errors when measuring the matter power spectrum from Ly-α.
• Gravitational lensing. What makes gravitational lensing especially interesting as a cosmological probe is that it is equally sensitive to all kinds of matter,
including dark matter. Gravitational lensing has mostly been used observing
galaxies and quasars. The CMB radiation will also be sensitive to lensing
effects. Although they are hard to extract from the data they will probably
become an important additional source of cosmological data.
• The observations of the oldest globular clusters indicate that the universe is
at least ∼ 12Gyr old, ruling out universe models which give a lower age of
the universe.
Chapter 4
Cosmological perturbation theory
4.1 Introduction
The study of perfect homogeneous and isotropic universe models is extremely useful for achieving a good understanding and good quantitative estimates for the
overall behavior of the universe. This is due to the well verified assumption of homogeneity and isotropy in the universe on sufficiently large scales$^1$. However, on
smaller scales the universe is clearly anything but homogeneous and isotropic.
The formation of LSS is attributed to the effect of gravitational collapse of tiny
initial density fluctuations. Density fluctuations can also be seen for example in
the CMB spectrum, but here at a state much closer to the primordial fluctuations.
The density fluctuations can be studied as perturbations on a perfectly homogeneous and isotropic background. Doing this, one will find that the evolution of
density fluctuations will depend heavily on the density and form of the energy that
dominates the universe in different epochs. Thus such studies of perturbations are
essential to our knowledge of the constituents of the universe.
This chapter is mainly based on the references [48], [40], [49] and [50].
4.2 The homogeneous and isotropic background
The simple idea of homogeneity and isotropy is powerful not only in the sense
that it is in very good accordance with the observations of LSS and CMB, but
also because it gives some very pleasant and simple solutions to Einstein’s field
equations which then reduce to the Friedmann equations (3.24) and (3.25).
We now introduce conformal time, η, defined by
$$a^2(\eta)\,d\eta^2 = dt^2. \tag{4.1}$$
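Since (4.1) implies $d\eta = dt/a = da/(a^2 H)$, the conformal time can be obtained from the first Friedmann equation by a single integral. The following sketch assumes a flat universe containing radiation, matter and vacuum energy with the scalings of section 3.3, works in units where c = 1, and returns η in units of $1/H_0$; the function name and the integration scheme are choices made for this example.

```python
import math


def conformal_time(a, omega_r, omega_m, omega_l, steps=10000):
    """Return H0 * eta(a), the integral of da' / (a'^2 E(a')) from 0 to a, E = H / H0.

    Substituting a' = s^2 removes the integrable singularity at a' = 0 that
    appears when omega_r = 0; the integral is estimated with the midpoint rule.
    """
    s_max = math.sqrt(a)
    ds = s_max / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * ds
        # a'^2 E(a') = sqrt(omega_r + omega_m a' + omega_l a'^4) with a' = s^2
        total += 2.0 * s / math.sqrt(omega_r + omega_m * s ** 2 + omega_l * s ** 8)
    return total * ds


print(conformal_time(1.0, 0.0, 1.0, 0.0))  # matter only: 2 sqrt(a) = 2
print(conformal_time(0.5, 1.0, 0.0, 0.0))  # radiation only: a = 0.5
```

The two printed cases reproduce the analytic results η ∝ a^(1/2) for matter domination and η ∝ a for radiation domination.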
1 For a discussion of some problems related to verifying cosmological assumptions and models,
see appendix A.
The zeroth order background for our perturbations can then be described by the
FRW line element,
$$ds^2 = {}^{(0)}g_{\mu\nu}\,dx^\mu dx^\nu = a^2(\eta)\left(d\eta^2 - \delta_{ij}\,dx^i dx^j\right), \tag{4.2}$$
where the zeroth-order FRW-metric, ${}^{(0)}g_{\mu\nu}$, can be written in matrix form as
$${}^{(0)}g_{\mu\nu} = a^2(\eta)\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \tag{4.3}$$
4.3 Perturbations to the FRW-metric
The introduction of physical inhomogeneities in the universe can be expressed as
a first order perturbation, $\delta g_{\mu\nu}$, to the FRW-metric. The line element can then be
written as
$$ds^2 = \left({}^{(0)}g_{\mu\nu} + \delta g_{\mu\nu}\right)dx^\mu dx^\nu \tag{4.4}$$
Since the metric is a symmetric tensor, there will be at most 10 degrees of freedom in the perturbations stemming from δgµν . We will now show how these can
be decomposed into scalar, vector and tensor perturbations named after how they
behave under a transformation between two three-dimensional coordinate systems.
The motivation for such a decomposition is firstly that in the perturbed Einstein
equations these three parts will evolve independently of one another, and therefore
can be treated separately. Secondly, they will have different physical interpretations
and behavior, as will be specified below.
4.3.1 Decomposition of perturbations
A useful way to parametrize the metric when doing perturbation theory is to split
it into an FRW background and an additional perturbation part. The perturbation
can thus be written as
$$g_{\mu\nu} \to {}^{(0)}g_{\mu\nu} + \delta g_{\mu\nu} = a^2\left(\eta_{\mu\nu} + h_{\mu\nu}\right). \tag{4.5}$$
The perturbation $h_{\mu\nu}$ is here assumed to be a first order correction to the background metric, and all higher order terms are neglected. We are free to parametrize
$h_{\mu\nu}$ like
$$h_{\mu\nu} = \begin{pmatrix} 2\phi & B_i \\ B_i & -C_{ij} \end{pmatrix}. \tag{4.6}$$
Here φ will have one degree of freedom, $B_i$ will have 3, and $C_{ij}$ will have 6 (being
a symmetric 3 × 3 tensor). Using this parametrization, we may rewrite (4.4) as
$$ds^2 = a^2\left[(1 + 2\phi)\,d\eta^2 + 2B_i\,dx^i d\eta - (\delta_{ij} + C_{ij})\,dx^i dx^j\right]. \tag{4.7}$$
φ is a scalar which cannot be split up further. $B_i$ and $C_{ij}$, on the other hand, can
be decomposed into their scalar, vector and tensor constituents. We write $B_i$ as
$$B_i = -\partial_i B + V_i \tag{4.8}$$
where B is a scalar potential and $V_i$ is a divergence-free vector. This is just the
common decomposition of a vector into a curl-free part (which can be written as the
gradient of a scalar potential) and a divergence-free part. $C_{ij}$ can be parametrized
as
$$C_{ij} = \underbrace{-2\psi\delta_{ij}}_{1} + \underbrace{\partial_i\partial_j E}_{1} + \underbrace{\partial_i E_j + \partial_j E_i}_{2} + \underbrace{h_{ij}}_{2} \tag{4.9}$$
where the numbers under the braces correspond to the number of degrees of freedom associated with the different terms (which sum up to 6). This parametrization
of a rank 2 tensor is analogous to the more commonly applied splitting of a vector
into a divergence-free and a curl-free part as shown for the $B_i$ vector above. Here ψ
and E are scalar fields. The term $\partial_i E_j + \partial_j E_i$ is built from a divergence-free vector field $E_i$. The
reason why it has two instead of three degrees of freedom is that $E_i$ is divergence-free,
and thus $\partial_i E^i = 0$. The $h_{ij}$ term describes the tensor fluctuations. It has only
two degrees of freedom since in this parametrization of $C_{ij}$ it will be constrained
by the so-called TT (transverse traceless) gauge with $h^i_{\ i} = 0$ and $\partial^i h_{ij} = 0$.
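As a quick bookkeeping check, added here for clarity, the degrees of freedom of the individual pieces add up to the 10 independent components of the symmetric perturbation $h_{\mu\nu}$:

```latex
\underbrace{\phi,\ B,\ \psi,\ E}_{\text{scalar: } 4}
\;+\;
\underbrace{V_i,\ E_i}_{\text{vector: } 2 + 2}
\;+\;
\underbrace{h_{ij}}_{\text{tensor: } 2}
\;=\; 10 .
```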
With this parametrization of gµν , we can now collect the scalar, vector and
tensor parts of our first-order metric as
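• Scalar perturbations
$$h^{\rm scalar}_{\mu\nu} = \begin{pmatrix} 2\phi & -\partial_i B \\ -\partial_i B & 2\psi\delta_{ij} - \partial_i\partial_j E \end{pmatrix} \tag{4.10}$$
• Vector perturbations
$$h^{\rm vector}_{\mu\nu} = \begin{pmatrix} 0 & V_i \\ V_i & -(\partial_i E_j + \partial_j E_i) \end{pmatrix} \tag{4.11}$$
• Tensor perturbations
$$h^{\rm tensor}_{\mu\nu} = \begin{pmatrix} 0 & 0 \\ 0 & -h_{ij} \end{pmatrix} \tag{4.12}$$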
The scalar perturbations are the only ones that couple to matter in first order
theory, and are also the most important ones that couple to photons. Vector perturbations will give nonzero off-diagonal elements (corresponding to shear forces
in the fluid models). Such modes are included in some cosmological models, but as
they will decay with the expansion of the universe, they can often be omitted. The
two degrees of freedom in the tensor perturbations correspond to two polarization
states of gravitational waves. In first order perturbation theory gravitational waves
do not couple to matter.
Based on the physical arguments concerning the different classes of perturbations given above, and especially since gravitational collapse is the main subject
of this thesis, I will mainly concentrate on scalar perturbations in the rest of this
thesis.
4.4 Freedom of gauge choice
As already mentioned, the perturbations to the metric will have at most 10 degrees
of freedom. Some of these degrees of freedom will turn out to depend only on the
choice of coordinate system and are hence only gauge artifacts and not real physical
degrees of freedom. If we can identify these gauge-dependent fluctuations in the
metric, we can then eliminate some of our apparent degrees of freedom to simplify
our theory, and that is of course a good thing.
Again we make a distinction between an unperturbed background space-time,
M0 , and a physical, perturbed space-time, M. A specific choice of coordinates
can be related to a specific mapping, D, between M0 and M. Another coordinate
choice will map the same point in M0 to a different point in M through a new
mapping D̃.
Now, let us consider a physical quantity, Q, mapped from M0 to M with both
D and D̃. A difference in Q given the two different mappings, must then be a
gauge-dependent artifact of our theory with no physical significance. We call the
corresponding quantity on $\mathcal{M}_0$ $Q^{(0)}$. At a point $x \in \mathcal{M}$ the perturbation $\delta Q$ to $Q$ with the mapping $D$ is defined by
$$\delta Q(x) = Q(x) - Q^{(0)}\left(D^{-1}(x)\right), \tag{4.13}$$
and analogously, with the mapping $\tilde D$:
$$\delta\tilde Q(x) = Q(x) - Q^{(0)}\left(\tilde D^{-1}(x)\right). \tag{4.14}$$
Since we demand that our theory should be independent of coordinate choice, the
quantity
∆Q(x) = δQ̃(x) − δQ(x)
(4.15)
must be a pure gauge artifact.
To find how coordinate transformations will change our metric, we will consider an infinitesimal coordinate transformation
xµ → x̃µ = xµ + ξ µ .
(4.16)
The time component $\xi^0$ will lead to scalar perturbations, while the spatial part of $\xi^\mu$ can be decomposed into
$$\xi^i = \xi^i_{\rm tr} + \gamma^{ij}\xi_{,j} \tag{4.17}$$
where $\gamma^{ij}$ is the spatial part of the background metric. $\xi^i_{\rm tr}$ is a transverse part corresponding to two degrees of freedom connected to the vector perturbations, and $\xi_{,j}$ gives scalar perturbations. Here I will only consider the scalar part. Under
such a coordinate transformation the metric will transform as
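$$\delta g_{\mu\nu} \to \delta g_{\mu\nu} - \nabla_\mu\xi_\nu - \nabla_\nu\xi_\mu \tag{4.18}$$
where the $\nabla$s denote covariant derivatives defined by
$$\nabla_\mu\xi_\nu = \partial_\mu\xi_\nu - \Gamma^\lambda_{\mu\nu}\xi_\lambda. \tag{4.19}$$
The covariant components of $\xi^\mu$ are given by
$$\xi_\mu = g_{\mu\nu}\xi^\nu = a^2(\xi^0, \xi^i) \tag{4.20}$$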
where we only have used the background part of the metric. This is valid since
we are working in first order perturbation theory and both the ξs and our scalar
potentials are first order quantities. Using the definition of the Christoffel symbols
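given in (3.17) we have the non-vanishing components [40]
$$\Gamma^0_{00} = \mathcal{H}, \qquad \Gamma^0_{ij} = \mathcal{H}\delta_{ij}, \qquad \Gamma^i_{0j} = \mathcal{H}\delta^i_j \tag{4.21}$$
where $\mathcal{H} \equiv a'/a = aH$, and a prime denotes a derivative with respect to conformal time $\eta$. Using the expression for the transformation of the metric given in (4.18) we will now parametrize this change in the metric as a transformation of our four scalar potentials $\phi$, $B$, $\psi$ and $E$ given in (4.10). For the
00-component we have
$$\delta g_{00} = a^2 h_{00} = 2\phi a^2. \tag{4.22}$$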
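This component will now transform like
$$\begin{aligned}
\delta g_{00} \to \delta\tilde g_{00} &= 2\phi a^2 - 2\nabla_0\xi_0 \\
&= 2\phi a^2 - 2\left(\xi_0' - \mathcal{H}\xi_0\right) \\
&= 2\phi a^2 - 2\left[(a^2\xi^0)' - \mathcal{H}a^2\xi^0\right] \\
&\equiv 2\tilde\phi a^2
\end{aligned} \tag{4.23}$$
where the last line corresponds to the definition that the change in the metric should be parametrized as a change in our potentials. This gives
$$\begin{aligned}
\tilde\phi &= \frac{1}{2a^2}\left[2\phi a^2 - 4aa'\xi^0 - 2a^2\xi^{0\prime} + 2\mathcal{H}a^2\xi^0\right] \\
&= \phi - \mathcal{H}\xi^0 - \xi^{0\prime}
\end{aligned} \tag{4.24}$$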
Similarly, doing the transformations on the other metric components we will find how all of our four scalar potentials transform. The result is:
$$\begin{aligned}
\tilde\phi &= \phi - \mathcal{H}\xi^0 - \xi^{0\prime} \\
\tilde B &= B + \xi^0 - \xi' \\
\tilde E &= E - \xi \\
\tilde\psi &= \psi + \mathcal{H}\xi^0
\end{aligned} \tag{4.25}$$
It is now easy to verify that the two quantities $\Phi$ and $\Psi$ defined by
$$\Phi = \phi + \frac{1}{a}\frac{\partial}{\partial\eta}\left[(B - E')a\right], \qquad \Psi = \psi - \mathcal{H}(B - E') \tag{4.26}$$
will be gauge invariant when using the transformation rules given in (4.25).
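The verification can be written out in a few lines. The following is only a sketch of the check, using the transformation rules (4.25) and the definition $\mathcal{H} = a'/a$; nothing beyond those two inputs is assumed.

```latex
\begin{aligned}
\tilde B - \tilde E' &= (B + \xi^0 - \xi') - (E - \xi)' = (B - E') + \xi^0, \\
\tilde\Phi &= \tilde\phi + \frac{1}{a}\left[(\tilde B - \tilde E')a\right]'
  = \phi - \mathcal{H}\xi^0 - \xi^{0\prime}
    + \frac{1}{a}\left[(B - E')a\right]'
    + \underbrace{\frac{1}{a}\left(\xi^0 a\right)'}_{=\,\xi^{0\prime} + \mathcal{H}\xi^0}
  = \Phi, \\
\tilde\Psi &= \psi + \mathcal{H}\xi^0 - \mathcal{H}\left[(B - E') + \xi^0\right] = \Psi .
\end{aligned}
```

In both cases the $\xi^0$ terms cancel identically, and the spatial function $\xi$ drops out already in the combination $B - E'$.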
Now, any physically observable quantity should be gauge invariant, i.e. not depend on the choice of coordinates. Familiar examples are the observable electromagnetic fields, E and B, which are independent of the choice of gauge for the electromagnetic vector potential $A_\mu$. In this picture the analogues of the gauge invariant E and B fields would be the gauge invariant quantities $\Phi$ and $\Psi$ in (4.26). Such potentials formed the basis for the first gauge invariant cosmological perturbation theory, formulated by Bardeen [51].
But then again, metric perturbations are not physical observables, so we are not constrained to work only with such gauge invariant variables. Equally well we can introduce suitable constraints on the functions in (4.25) to reduce the degrees of freedom in our theory of scalar perturbations to two. This is called making a specific choice of gauge.
Historically, the first gauge choice made in the study of cosmological fluctuations is called synchronous gauge. Here one sets φ = B = 0 in equations (4.25). The synchronous gauge has some drawbacks, for instance that it does not uniquely specify the metric perturbations. Thus another gauge choice called conformal Newtonian gauge is more commonly used in the study of cosmological perturbations today.
The conformal Newtonian gauge is defined by setting
B=E=0
(4.27)
in (4.25). Here the metric perturbations will be uniquely defined (the ∆Q(x) from (4.15) vanishes), so no freedom to change coordinates remains, which is exactly what we wanted. This uniqueness is easily seen by substituting the constraint into the expression (4.26) for the gauge invariant functions Ψ and Φ, which then coincide with the only remaining free functions ψ and φ in the scalar metric perturbations. From now on I will only be using this gauge in my analytic considerations.
In conformal Newtonian gauge the scalar metric perturbations from (4.10) yield
$$\delta g_{\mu\nu} = 2a^2(\eta)\begin{pmatrix} \phi & 0 \\ 0 & \psi\delta_{ij} \end{pmatrix}. \tag{4.28}$$
Using (4.4) we can now write the line element
$$ds^2 = a^2(\eta)\left[(1 + 2\phi)\,d\eta^2 - (1 - 2\psi)\,\delta_{ij}\,dx^i dx^j\right]. \tag{4.29}$$
So, what does (4.29) tell us? Not too much. Only that this is a possible way to
parametrize the scalar fluctuations in the metric with the correct number of degrees
of freedom. What we are really interested in, is how these perturbations evolve in
time. To find that we have to figure out
• how the different energy components in the universe respond to the perturbations in the metric. Then we have to study the particle distributions given
by the Boltzmann equations.
• how the metric perturbations will respond to the perturbations in the energy
densities and evolution of the universe. To do this we have to study the
Einstein equations (3.9) using the perturbed metric and the perturbed energy
densities.
So the metric changes the energy distributions and the energy distributions change the metric. Some of the cosmic fluids also interact with each other, such as electrons and photons (at least until last scattering). As we go closer and closer to the big bang we expect that all the energy components were in equilibrium and then decoupled one after another, as mentioned earlier. But when considering the growth of perturbations it should be sufficient to consider the Compton scattering between photons and electrons and the Coulomb scattering between electrons and protons, since the other interactions between the energy components ceased to be efficient before structure started to grow significantly (see e.g. [49]). The point is that since all the components and the metric are connected, as illustrated in Fig. 4.1, all these equations have to be solved simultaneously.
4.5 Particle distributions and the Boltzmann equations
The Boltzmann equation is in principle very simple:
$$\frac{df_i(x, p)}{dt} = C[f] \tag{4.30}$$
Here $f_i$ is the distribution function for a particle species. The $C[f]$ term describes all the collision terms, which depend on the interaction rates with the other species present and their distributions. So the complicated physics is hidden in this term.
What one should do now is to solve (4.30) for all the species shown in Fig. 4.1. To do this properly consumes a lot of spacetime, so I will not do that here. Instead I will only do it in detail for the massive neutrino component. The corresponding derivations for the other species follow roughly the same steps, but of course with some differences. Here I will let the massive neutrinos serve as an example of how the Boltzmann equation can be solved in a first order theory. The main references used for this derivation are [49] and [24]².
Figure 4.1 (diagram connecting the metric $g_{\mu\nu}$ with neutrinos, dark matter, photons, electrons and protons): All the energy components interact with the metric. In perturbation theory we can omit contributions to the metric perturbation potentials from a cosmological constant, since dark energy with $w_X = -1$ does not cluster. We also have interactions between photons and electrons (Compton scattering) and electrons and protons (Coulomb scattering). This means that the perturbed Einstein equations and the Boltzmann equations for all species have to be solved simultaneously.
4.5.1 The perturbation equations for massive neutrinos
I will only consider the perturbation growth for the neutrinos after decoupling from
the baryon-photon plasma. Thus the right hand side of (4.30) is zero and we have
$$\frac{df_\nu(x, P)}{dt} = 0 \tag{4.31}$$
where $x$ and $P$ both are 4-vectors. $P$ is here a momentum variable. We define the 4-momentum as
$$P^\mu \equiv \frac{dx^\mu}{d\lambda}. \tag{4.32}$$
Here $\lambda$ is just a parameter that is monotonically increasing along a particle's path. To remove one degree of freedom we use that the square of the 4-momentum is a conserved quantity obeying
$$P^2 \equiv g_{\mu\nu}P^\mu P^\nu = m^2 \tag{4.33}$$
since we are using comoving coordinates. We now use this constraint to fix the time component $P^0$. In (4.29) we defined our perturbed line element. The corresponding perturbed metric in conformal Newtonian gauge can be written
$$g_{\mu\nu} = a^2(\eta)\begin{pmatrix} (1 + 2\phi) & 0 \\ 0 & -(1 - 2\psi)\delta_{ij} \end{pmatrix}. \tag{4.34}$$
² Note that these references use different definitions for their metrics.
Using this perturbed metric we write
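$$\begin{aligned}
P^2 &= g_{00}(P^0)^2 + \underbrace{g_{ij}P^iP^j}_{\equiv\,-p^2} \\
&= a^2(\eta)(1 + 2\phi)(P^0)^2 - p^2 \\
&= m^2
\end{aligned} \tag{4.35}$$
⇓
$$\begin{aligned}
P^0 &= \frac{\sqrt{m^2 + p^2}}{a(\eta)\sqrt{1 + 2\phi}} \\
&\approx \frac{E}{a(\eta)}(1 - \phi) \\
&= \frac{\epsilon}{a^2(\eta)}(1 - \phi)
\end{aligned} \tag{4.36}$$
where $E = \sqrt{m^2 + p^2}$ and I have defined a new energy variable $\epsilon \equiv a\sqrt{m^2 + p^2}$. The approximation in the second line holds since we are working to first order in perturbation theory.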
The variable $P^i$, the canonical conjugate of the comoving coordinate $x^i$, is linked to the physical momentum observed by a comoving observer, $p$, by [24, 52]
$$P^i = -\hat p^i\,\frac{p}{a}(1 + \psi) = -\hat p^i\,\frac{q}{a^2}(1 + \psi), \tag{4.37}$$
where a new momentum variable, $q \equiv ap$, is defined. We also see that using this new variable $q$ we can write $\epsilon = \sqrt{a^2m^2 + q^2}$. The usefulness of these new variables will soon become obvious, but we already see that $q$ is a momentum variable that stays constant with the expansion of the universe for a non-interacting particle.
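To get a feeling for the variables $q$ and $\epsilon$, the following minimal Python sketch evaluates $\epsilon = \sqrt{q^2 + a^2m^2}$ and the particle speed $p/E = q/\epsilon$ at a few scale factors. The function names and the numerical values ($m = 0.1$ eV, $q = 5\times10^{-4}$ eV) are illustrative assumptions and are not taken from the references.

```python
import math

def comoving_energy(q, m, a):
    """epsilon = sqrt(q^2 + a^2 m^2), with q = a*p the comoving momentum."""
    return math.sqrt(q * q + (a * m) ** 2)

def velocity(q, m, a):
    """Particle speed p/E = q/epsilon in units of c."""
    return q / comoving_energy(q, m, a)

# Hypothetical example values in eV (natural units); q stays fixed while a grows.
m_nu = 0.1
q = 5.0e-4
for a in (1e-6, 1e-3, 1.0):
    p = q / a  # physical momentum redshifts as 1/a
    eps = comoving_energy(q, m_nu, a)
    # epsilon agrees with its original definition a*sqrt(m^2 + p^2)
    assert math.isclose(eps, a * math.sqrt(m_nu**2 + p**2), rel_tol=1e-12)
    print(f"a = {a:g}: p = {p:.3g} eV, v/c = {velocity(q, m_nu, a):.4f}")
```

With these assumed numbers the particle is ultra-relativistic at early times and clearly non-relativistic at $a = 1$, although $q$ itself never changes.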
We now expand the Boltzmann equation (4.31) as
$$\frac{df_\nu}{d\eta} = \frac{\partial f_\nu}{\partial\eta} + \frac{\partial f_\nu}{\partial x^i}\frac{dx^i}{d\eta} + \frac{\partial f_\nu}{\partial q}\frac{dq}{d\eta} + \frac{\partial f_\nu}{\partial\hat p^i}\frac{d\hat p^i}{d\eta} = 0, \tag{4.38}$$
where $\hat p^i$ is a unit vector in the direction of $p$ and $q$. To lowest order, neutrinos follow a simple Fermi-Dirac distribution which only depends on the magnitude of the momentum, so the $\partial f_\nu/\partial\hat p^i$ term must be a first order term. The $d\hat p^i/d\eta$ term must also be of first order, since the direction of the momentum can only change in the presence of a perturbation to the homogeneous background. The last term in (4.38) is therefore of second order and can be omitted in the first-order theory.
We now use the relations from (4.35) and (4.37) along with (4.32) and rewrite the $dx^i/d\eta$ term in (4.38) as
$$\begin{aligned}
\frac{dx^i}{d\eta} &= \frac{dx^i}{d\lambda}\frac{d\lambda}{d\eta} = \frac{P^i}{P^0} \\
&= -\frac{\hat p^i q(1 + \psi)}{\epsilon(1 - \phi)} \\
&\approx -\frac{\hat p^i q}{\epsilon}(1 + \phi + \psi)
\end{aligned} \tag{4.39}$$
when we keep only the first-order terms. In the Boltzmann equation (4.38) this $dx^i/d\eta$ term is multiplied by $\partial f_\nu/\partial x^i$, which is a first order term. This means that in our first-order theory, the $\phi$ and $\psi$ terms in (4.39) will vanish.
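The order counting behind this statement can be made explicit. As a short sketch, expanding the ratio of the two metric factors and multiplying by a generic first order gradient, written here as $\partial_i f_\nu = \mathcal{O}(1)$ in the perturbations (the $\mathcal{O}$ notation is introduced only for this illustration):

```latex
\begin{aligned}
\frac{1 + \psi}{1 - \phi} &= (1 + \psi)\left(1 + \phi + \mathcal{O}(\phi^2)\right)
  = 1 + \phi + \psi + \mathcal{O}(2), \\
\frac{dx^i}{d\eta}\,\frac{\partial f_\nu}{\partial x^i}
  &= -\frac{\hat p^i q}{\epsilon}\left(1 + \phi + \psi\right)\frac{\partial f_\nu}{\partial x^i}
  = -\frac{\hat p^i q}{\epsilon}\,\frac{\partial f_\nu}{\partial x^i} + \mathcal{O}(2).
\end{aligned}
```

Only the zeroth order part of $dx^i/d\eta$ therefore survives in the first-order Boltzmann equation.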
The $dq/d\eta$ term in (4.38) can be written as
$$\frac{dq}{d\eta} = q\psi' + \hat p^i\epsilon\,\phi_{,i}. \tag{4.40}$$
The derivation of this equation, being rather long and boring, is left to Appendix B.
So far I have just assumed that we have some collisionless matter particles. Now I will use the fact that neutrinos are fermions, and thus that they to the lowest order will follow the Fermi-Dirac distribution
$$f^{(0)}_\nu(q) = \frac{1}{e^{q/aT_\nu} + 1}. \tag{4.41}$$
Here it is easy to see the point of introducing this $q$ variable instead of $p$. Since $T_\nu$ scales like $a^{-1}$, $f^{(0)}_\nu$ as a function of $q$ will be time independent. Strictly speaking, as neutrinos are massive particles, the energy variable $\epsilon$ should have been used in the distribution function instead of $q$. But when neutrinos stopped interacting with the baryon-photon plasma at a temperature of 1 MeV they were still ultra-relativistic, so this correction is negligible. After they have stopped interacting, the neutrino phase space distribution will only be affected by changes in the metric. These gravitational changes in the neutrino distribution are what we are looking for and the very reason for doing perturbation theory. To include these fluctuations in our distribution function I define a small neutrino perturbation $N(x, q, \hat p^i, \eta) \ll 1$ by
$$f_\nu = f^{(0)}_\nu(q)\left[1 + N(x, q, \hat p^i, \eta)\right]. \tag{4.42}$$
From this equation it follows that
$$\frac{\partial f_\nu}{\partial\eta} = f^{(0)}_\nu N' \tag{4.43}$$
and using the result from (4.39) we have that
$$\frac{dx^i}{d\eta}\frac{\partial f_\nu}{\partial x^i} = -\frac{q\hat p^i}{\epsilon}f^{(0)}_\nu N_{,i}, \tag{4.44}$$
and from (4.40) it follows that to first order
$$\frac{dq}{d\eta}\frac{\partial f_\nu}{\partial q} = \left(q\psi' + \epsilon\hat p^i\phi_{,i}\right)\frac{\partial f^{(0)}_\nu}{\partial q}. \tag{4.45}$$
Using all these nice relations that have been found in the last pages, I now rewrite the Boltzmann equation for massive neutrinos (4.38) as
$$f^{(0)}_\nu N' - \frac{\hat p^i q}{\epsilon}f^{(0)}_\nu N_{,i} + \left(q\psi' + \epsilon\hat p^i\phi_{,i}\right)\frac{\partial f^{(0)}_\nu}{\partial q} = 0. \tag{4.46}$$
To make the equation nicer and easier to solve, I now transform it to Fourier space, using the following definition for the Fourier transformation:
$$A(x) = \int\frac{d^3k}{(2\pi)^3}\,e^{i\mathbf{k}\cdot\mathbf{x}}A(k). \tag{4.47}$$
Transforming to k-space gives a much more useful form for our equations when we want to study the scale dependence of the matter fluctuations, since this makes the scale dependence explicit. Using this definition of our Fourier transforms our spatial derivatives transform as $\partial/\partial x^i \to ik_i$. Transforming to Fourier space and dividing by $f^{(0)}_\nu$, (4.46) now turns into
$$N' - i\frac{q}{\epsilon}(\mathbf{k}\cdot\hat{\mathbf{p}})N = -\left[\psi' + i\frac{\epsilon}{q}(\mathbf{k}\cdot\hat{\mathbf{p}})\phi\right]\frac{\partial\ln f^{(0)}_\nu}{\partial\ln q}. \tag{4.48}$$
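The only input from the unperturbed distribution in (4.48) is the logarithmic derivative $\partial\ln f^{(0)}_\nu/\partial\ln q$. For the Fermi-Dirac form (4.41) it can be written in closed form as $-x/(1 + e^{-x})$ with $x = q/aT_\nu$. The sketch below checks this against a finite difference; the function names and the choice of units with $aT_\nu = 1$ are assumptions made for the illustration.

```python
import math

def f0(q, T0=1.0):
    """Unperturbed Fermi-Dirac distribution as a function of comoving momentum q.
    T0 = a*T_nu is constant in time, so f0 carries no explicit time dependence."""
    return 1.0 / (math.exp(q / T0) + 1.0)

def dlnf0_dlnq(q, T0=1.0):
    """Analytic logarithmic derivative: -(q/T0) / (1 + exp(-q/T0))."""
    x = q / T0
    return -x / (1.0 + math.exp(-x))

def dlnf0_dlnq_numeric(q, T0=1.0, h=1e-6):
    """Central finite difference in ln q, used only as a cross-check."""
    up = math.log(f0(q * math.exp(h), T0))
    down = math.log(f0(q * math.exp(-h), T0))
    return (up - down) / (2.0 * h)

for q in (0.1, 1.0, 3.0, 10.0):
    print(q, dlnf0_dlnq(q), dlnf0_dlnq_numeric(q))
```

For $q \gg aT_\nu$ the derivative approaches $-q/aT_\nu$, so the high-momentum tail of the distribution responds most strongly to the metric perturbations on the right hand side of (4.48).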
This is the equation that we wanted. It shows how the phase space distribution of massive neutrinos and the metric affect each other. That is, we have found a mathematical expression for one of the seven arrows in Figure 4.1. But the metric is of course affected by all the cosmic fluids, and to do meaningful calculations one has to find the corresponding equations for photons, baryonic matter and cold dark matter³.
If we set $q = \epsilon$ in (4.48) we will have the perturbation equation for massless neutrinos.
So what is the effect of giving neutrinos mass? We obviously get some corrections to our perturbation equations from the $q/\epsilon$ terms. But the equations here only deal with the perturbations to the metric. What determines the interplay between geometry and matter is the Einstein equations (3.9), so we will need to translate the perturbations in the metric to perturbations in the Einstein tensor to get the perturbed left hand side of the Einstein equations. And to find the right hand side of (3.9) we need to find an expression for the perturbed energy momentum tensor.
³ A cosmological constant is by definition constant, and will not have a perturbed energy density. If you have a dark energy component which is not a cosmological constant, such as a quintessence model, you have to find the perturbation equation for this component as well.
4.6 The perturbed Einstein equations
4.6.1 The perturbed Einstein tensor
In section 3.3 I showed how the use of a homogeneous 0-order metric (3.22) leads
to the Friedmann equations whose solutions give the evolution of the homogeneous
background in the universe. However, as we now want to find the perturbed Einstein equations, we have to calculate the Einstein tensor for the perturbed metric
(4.34). Doing this is a lot of work, and people have done it before, so I will not
derive these components of the Einstein tensor here, but just refer to results which
can be found in e.g. refs. [24, 48, 50, 40]. Here I will refer to the perturbed Einstein tensor δGµν as the full Einstein tensor obtained from the metric (4.34) with
the homogeneous part subtracted. The resulting non-zero components are:
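$$\delta G^0_0 = \frac{2}{a^2}\left[-3\mathcal{H}^2\phi - 3\mathcal{H}\psi' + \nabla^2\psi\right] \tag{4.49}$$
$$\delta G^0_i = \frac{2}{a^2}\,\partial_i\left[\mathcal{H}\phi + \psi'\right] \tag{4.50}$$
$$\begin{aligned}
\delta G^i_j = &-\frac{2}{a^2}\left[\left(2\frac{a''}{a} - \mathcal{H}^2\right)\phi + \mathcal{H}(\phi' + 2\psi') + \psi'' + \frac{1}{3}\nabla^2(\phi - \psi)\right]\delta^i_j \\
&+ \frac{1}{a^2}\left[\partial^i\partial_j - \frac{1}{3}\nabla^2\delta^i_j\right](\phi - \psi)
\end{aligned} \tag{4.51}$$
These equations now have to be combined with a perturbed energy-momentum tensor $\delta T^\mu_\nu$ to find the perturbed Einstein equations
$$\delta G^\mu_\nu = 8\pi G\,\delta T^\mu_\nu. \tag{4.52}$$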
4.6.2 The perturbed energy-momentum tensor
In section 3.2.2 we saw that the energy-momentum tensor for a perfect fluid can be
written as
Tµν = (ρ + P )uµ uν + P gµν ,
(4.53)
where P now denotes the pressure, and uµ is the 4-velocity. In perturbation theory
we are working with perturbed quantities of the form
$$\rho \to \rho + \delta\rho, \qquad P \to P + \delta P.$$
Introducing our perturbed energy density and pressure, it turns out that the first order perturbed energy momentum tensor takes the form [24, 40, 48, 50]
$$\delta T^0_0 = \delta\rho \tag{4.54}$$
$$\delta T^0_i = (\rho + P)\,v_i^{||} \tag{4.55}$$
$$\delta T^i_j = -\delta P\,\delta^i_j + \Sigma^{i||}_j. \tag{4.56}$$
So the fluid can in general not be regarded as perfect anymore. Here $v_i^{||}$ is the longitudinal component of the 3-velocity field, which can also be written as the gradient $v_i^{||} = \partial_i V$ of a velocity potential $V$. The $\Sigma^{i||}_j$ is the traceless and longitudinal part of the non-perfect-fluid effects on the energy momentum tensor. In general this part can be written as $\Sigma^i_j$, but it is only the traceless and longitudinal part that accounts for scalar perturbations. The rest of the $\Sigma^i_j$ tensor is only interesting when studying vector and tensor perturbations. Having a scalar potential $\Sigma$ we can write [40, 24]
$$\Sigma^{i||}_j = \left(\partial^i\partial_j - \frac{1}{3}\delta^i_j\nabla^2\right)\Sigma. \tag{4.57}$$
4.6.3 Combining the equations
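Now we have expressions for both the perturbed Einstein tensor and the perturbed energy-momentum tensor. Inserting these into the expression for the perturbed Einstein equation (4.52) we find:
$$\text{00-comp.:}\quad \frac{2}{a^2}\left[-3\mathcal{H}^2\phi - 3\mathcal{H}\psi' + \nabla^2\psi\right] = 8\pi G\,\delta\rho \tag{4.58}$$
$$\text{0i-comp.:}\quad \frac{2}{a^2}\,\partial_i\left[\mathcal{H}\phi + \psi'\right] = 8\pi G(\rho + p)\,v_i^{||} \tag{4.59}$$
$$\begin{aligned}
\text{ij-comp.:}\quad &-\frac{2}{a^2}\left[\left(2\frac{a''}{a} - \mathcal{H}^2\right)\phi + \mathcal{H}(\phi' + 2\psi') + \psi'' + \frac{1}{3}\nabla^2(\phi - \psi)\right]\delta^i_j \\
&+ \frac{1}{a^2}\left[\partial^i\partial_j - \frac{1}{3}\nabla^2\delta^i_j\right](\phi - \psi) = 8\pi G\left(-\delta p\,\delta^i_j + \Sigma^{i||}_j\right)
\end{aligned} \tag{4.60}$$
Following the procedure used in [24] I reexpress the quantities $v_i^{||}$ and $\Sigma^{i||}_j$ using the new quantities
$$\theta \equiv \sum_i \partial_i v_i = \nabla^2 V, \tag{4.61}$$
which will represent the velocity divergence, and $\sigma$, which will represent the anisotropic stress and is defined by
$$(\rho + p)\nabla^2\sigma \equiv -\sum_{i,j}\left(\partial_i\partial_j - \frac{1}{3}\nabla^2\delta_{ij}\right)\Sigma^{||i}_j = -\frac{2}{3}\nabla^4\Sigma. \tag{4.62}$$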
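The factor $2/3$ in the definition of $\sigma$ follows from contracting the traceless operator with itself. A short sketch of the step, using only the form (4.57) of the scalar anisotropic stress and the identities $\sum_{i,j}\partial_i\partial_j\partial_i\partial_j = \nabla^4$ and $\sum_{i,j}\delta_{ij}\delta_{ij} = 3$:

```latex
\begin{aligned}
\sum_{i,j}\left(\partial_i\partial_j - \tfrac{1}{3}\nabla^2\delta_{ij}\right)
          \left(\partial_i\partial_j - \tfrac{1}{3}\delta_{ij}\nabla^2\right)\Sigma
 &= \left(\nabla^4 - \tfrac{1}{3}\nabla^4 - \tfrac{1}{3}\nabla^4 + \tfrac{1}{9}\cdot 3\,\nabla^4\right)\Sigma
  = \tfrac{2}{3}\nabla^4\Sigma .
\end{aligned}
```

In Fourier space, where $\nabla^2 \to -k^2$, the definition of $\sigma$ then reduces to $(\rho + p)\sigma = \tfrac{2}{3}k^2\Sigma$.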
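Using the definitions from (4.61) and (4.62), defining a new perturbation variable $\delta \equiv \delta\rho/\rho$ and converting to Fourier space, the perturbed Einstein equations can be written
$$(4.58) \Rightarrow\quad -3\mathcal{H}^2\phi - 3\mathcal{H}\psi' - k^2\psi = 4\pi Ga^2\rho\,\delta \tag{4.63}$$
$$(4.59) \Rightarrow\quad -k^2\left[\mathcal{H}\phi + \psi'\right] = 4\pi Ga^2(p + \rho)\,\theta \tag{4.64}$$
$$(4.60) \Rightarrow\quad \left(2\frac{a''}{a} - \mathcal{H}^2\right)\phi + \mathcal{H}(\phi' + 2\psi') + \psi'' - \frac{k^2}{3}(\phi - \psi) = 4\pi Ga^2\,\delta p \tag{4.65}$$
$$(4.60) \Rightarrow\quad k^2(\phi - \psi) = 12\pi Ga^2(\rho + p)\,\sigma. \tag{4.66}$$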
In addition the Einstein equations give us the constraint that the energy-momentum tensor of each of the cosmic fluids should be conserved. Mathematically this can be expressed through the continuity equation [24]
$$\delta' = (1 + w)(\theta + 3\psi') \tag{4.67}$$
and the Euler equation
$$\theta' = \mathcal{H}(3w - 1)\theta - \frac{w'}{1 + w}\theta - k^2\phi - k^2\sigma - \frac{w}{1 + w}k^2\delta. \tag{4.68}$$
These relations will hold separately for each uncoupled energy component.
Now we know how the metric perturbations and the components of the energy momentum tensor affect each other through the Einstein equations. Earlier we found how the metric perturbations and particle distributions affect each other using the Boltzmann equations (exemplified with massive neutrinos). So given a set of initial conditions and knowledge of the homogeneous background evolution from the Friedmann equations, we should now be able to solve for the structure growth of the universe⁴.
The problem is that we have a set of coupled differential equations that cannot be solved analytically without making some really crude simplifications. These simplifications can of course be made to get some idea of the qualitative properties of our equations, but making good quantitative predictions requires the use of a numerical Boltzmann code like CAMB [53]⁵.
⁴ This will of course only apply to scales where non-linear effects are negligible.
⁵ This code does however not use conformal Newtonian gauge as I have been doing, but synchronous gauge (see section 4.4). Synchronous gauge is often preferred by people writing numerical Boltzmann codes, although the physical interpretations are a bit more obscure in this gauge. But computers are boring beings that don't care about physical interpretations. A bit like mathematicians, maybe.
4.7 Solutions to the perturbation equations
The perturbation equations have now been found. Sensible analytical solutions can be found in the limits where one assumes that the universe is dominated by one energy component which fully determines the evolution of the metric perturbations.
Then the sub-dominant fluids can be regarded as “test fluids” following the given metric perturbations without affecting them. This can for instance be done deep into the matter dominated epoch, where the matter fluctuations will determine the perturbation growth and the different cosmic fluids can be regarded as independent. However, during radiation domination the radiation and baryonic matter components are coupled through Compton and Coulomb scattering, which will complicate the solutions.
In the following I will not focus on how to obtain these mathematical solutions,
but rather discuss the physics that they imply. Solving the full set of equations will
anyway require extensive use of numerics, and doing sensible analytical approximations is outside the scope of this master thesis. This section is mainly based on
the excellent reference [24]. As done in this reference I will first look at a pure,
flat, neutrinoless ΛCDM universe to clarify the most important effects, and then
add massive neutrinos to see how they alter our observables.
4.8 Solutions in a pure ΛCDM model
Our main observables related to perturbation theory are the CMB and LSS power
spectra, which were both introduced in section 3.5. In the formation of structures
in these observables, there are two different horizons that play a crucial role:
• The particle horizon, which sets the causal scale. Two objects outside each other's particle horizon cannot affect each other causally due to the finite speed of light. That is, a physical effect occurring at a time $t_i$ can only affect objects inside a horizon given by
$$d(t_i, t) = a(t)\int_{t_i}^{t}\frac{dt'}{a(t')} \tag{4.69}$$
where c = 1 as usual. The particle horizon can be approximated by the Hubble length given by
$$R_H = \frac{1}{H(t)} \tag{4.70}$$
which sets the right scale, which is all that we need in a qualitative discussion. Gravitational signals travel with the speed of light, so if you have an energy overdensity, only particles within a Hubble radius around this overdensity will be subject to gravitational collapse. Using general relativity we are also able to calculate super-horizon perturbation effects. However, the effects are not large and their behavior is gauge-dependent. I will in the following not comment further on super-horizon perturbations.
• The sound horizon, which sets the scale reachable for pressure waves. Thus, if we have a sound velocity $c_s < 1$ in our cosmic fluid, the sound horizon is given by
$$r_s(t) = a(t)\int_{t_i}^{t}\frac{c_s\,dt'}{a(t')} \approx \frac{c_s}{H}. \tag{4.71}$$
The speed of sound in a perfect fluid is given by
$$c_s^2 = \frac{\partial P}{\partial\rho} = wc^2 = w, \tag{4.72}$$
so in a radiation dominated universe $c_s \approx 1/\sqrt{3}$.
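How well the Hubble length approximates the particle horizon can be checked for simple power-law backgrounds. The sketch below assumes $a(t) = t^n$ with $n = 1/2$ (radiation domination) and $n = 2/3$ (matter domination), takes $t_i$ close to zero, and integrates (4.69) numerically; the function names, the value of $t_i$ and the number of integration steps are choices made for this illustration.

```python
import math

def particle_horizon(n, t, t_i=1e-12, steps=20000):
    """d(t_i, t) = a(t) * integral_{t_i}^{t} dt'/a(t') for a(t) = t**n, with c = 1.
    The integral is evaluated with the midpoint rule in the variable u = ln t'."""
    u0, u1 = math.log(t_i), math.log(t)
    du = (u1 - u0) / steps
    total = 0.0
    for j in range(steps):
        u = u0 + (j + 0.5) * du
        total += math.exp((1.0 - n) * u) * du  # dt'/a(t') = t'^(1-n) du
    return t**n * total

def hubble_length(n, t):
    """R_H = 1/H, where H = (da/dt)/a = n/t for a(t) = t**n."""
    return t / n

ratios = {name: particle_horizon(n, 1.0) / hubble_length(n, 1.0)
          for name, n in (("radiation", 0.5), ("matter", 2.0 / 3.0))}
print(ratios)
```

Analytically the ratio is $n/(1 - n)$, which is 1 in the radiation case and 2 in the matter case, so under these assumptions the Hubble length reproduces the particle horizon up to a factor of order unity.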
4.8.1 Jeans scale and radiation domination
Using the first Friedmann equation we see that the sound horizon (4.71) is closely related to the so-called Jeans length⁶, $\lambda_J$, often defined by
$$\lambda_J \equiv 2\pi c_s\sqrt{\frac{2}{3}}\left(\frac{3}{8\pi G\rho}\right)^{1/2} = 2\pi\frac{a(t)}{k_J(t)}, \tag{4.73}$$
where $k_J$ is called the Jeans wave number. On scales below this Jeans scale, the overdensity caused by gravitational collapse will induce a net pressure force in the opposite direction if we are considering fluids with w > 0. This effect will resist perturbation growth on scales below $\lambda_J$, and these modes will instead tend to oscillate. These oscillations are the seeds for the acoustic peaks that we see in the CMB power spectrum today. So the fluctuations of the dominant part of the energy content, namely radiation, will oscillate on small scales. We are also interested in the dark matter fluctuations, since these will dominate the metric perturbations at later times, during the formation of the matter fluctuations that we observe today through galaxy surveys. The dark matter particles do not feel pressure forces and can therefore in principle collapse inside $\lambda_J$. But this clustering will be very slow (logarithmic in $\eta$), since the main contribution to the metric perturbations in this epoch stems from the oscillating radiation.
This interplay between the oscillations in the photon-baryon plasma and the
cold dark matter particles can be studied today when comparing the CMB and
LSS power spectra. In the CMB power spectrum we have a rather direct observation of the perturbations in the photon-baryon plasma not too long after
matter-radiation equality. Here the acoustic peaks are huge. In the matter power
spectrum today we can see tiny oscillations stemming from oscillations in the metric on λ < λJ in the early universe. If the matter content of the universe was
only baryonic, these fluctuations in the matter power spectrum would have been
much larger. This effect of acoustic oscillations in the matter power spectrum has
also been detected by SDSS [54]. In Figure 4.2 it is illustrated how the acoustic
oscillations are imprinted in the matter power spectrum.
On scales larger than $\lambda_J$ but smaller than $R_H$ there are no pressure forces to resist gravitational collapse. Solving the perturbation equations gives that perturbations will grow as $\sim a^2$ on these scales during radiation domination. This, however, only applies to the dominating radiation component, and not to the matter component. Even if the metric perturbations will follow the perturbations in the radiation component, matter fluctuations will turn out to be constant. This happens because the rapid expansion of the universe will make clustering of matter more difficult, and this effect will almost perfectly cancel the effect of gravitational collapse in the radiation dominated epoch.
⁶ Named after the British physicist, mathematician and astronomer Sir James Hopwood Jeans (1877-1946).
Figure 4.2 (log-log plot of $P(k)$ in $(\mathrm{Mpc}/h)^3$ against $k$ in $h/\mathrm{Mpc}$, with curves labelled $\omega_b = 0.027$ and $\omega_b = 0.1$): The matter power spectrum for standard ΛCDM parameters (solid line), and a model with more baryons (dash-dotted line). We can see traces of the baryonic oscillations in the solid line, and we see that they are much more pronounced in the dash-dotted line, where some of the dark matter is replaced by more baryons. The plot is made using CAMB.
From (4.73) we see that $\lambda_J \propto \rho^{-1/2}$, which means that in a radiation dominated universe we have $\lambda_J \propto a^2$. So as the universe evolves, larger and larger scales will be inside a Jeans length, and thus not be able to cluster further in the linear regime.
4.8.2 Matter domination
Since the oscillation effect described in the previous section requires that the dominant cosmic energy component exerts pressure forces, this effect only applies to the first, radiation dominated epoch after the big bang. In the following dark matter dominated epoch, $\lambda_J$ will be zero, and structures can grow freely on all scales, since there are no pressure forces associated with the dark matter component.
It turns out that during matter domination perturbations on all scales will grow as $\delta_m \sim a$. This will be complicated a bit by effects such as free streaming, on which I will comment further when adding neutrinos to the universe.
4.8.3 Λ domination
When considering perturbations, there are mainly three effects of adding a cosmological constant to our universe:
• The increased expansion rate of the universe during Λ-domination will make
the metric perturbations decay and thus slow down the formation of structure.
• The cosmological constant itself does not cluster⁷.
• Assuming a flat universe, a larger value of ΩΛ will be compensated by a
smaller Ωm , and thus provide the universe with less cluster-willing matter.
All these effects are pulling in the same direction: Large ΩΛ means less clustering
of matter.
4.8.4 Summary
It has to be stressed that the above description of perturbation growth is highly
simplified. Especially in the transition phases between the different epochs corrections to such a simplified picture are important. But taking such effects into
account without treating the mathematical equations a lot more thoroughly would
be useless. Anyway, the qualitative picture of how the matter perturbations will
be altered when changing the energy contents of the universe will be as described
above. The effects can be summarized as:
Epoch                  Evolution of δm
Radiation domination   λ < λJ: oscillating and slowly clustering (∼ ln η).
                       λ > λJ: constant.
Matter domination      Grows as δm ∼ a on all scales (some suppression
                       on small scales because of free streaming).
Λ-domination           Slower growth as we come deeper into Λ-domination.
4.9 Massive neutrinos and structure formation
Since neutrinos are dark but not cold, this corrected universe model will be called
a Λ mixed dark matter (ΛMDM) model in contrast to the good, old ΛCDM model.
There are mainly two effects of the massive neutrinos that affect the CMB and LSS power spectra. Firstly, they free stream on large scales because of their small mass, and secondly they will have a mass-dependent effect on the time of matter-radiation equality, $a_{eq}$. Here I will mainly focus on how the massive neutrinos affect LSS, since the CMB effects will be treated more thoroughly in the next chapter.
⁷ However, using dark energy models with e.g. $w_X \neq -1$, clustering effects of dark energy will appear.
4.9.1 Neutrino free streaming
The free-streaming effect is an effect occurring in all collisionless cosmological fluids. It stems from the fact that the growth of a small overdensity takes a finite amount of time. Let us say that we have an initial overdensity with a typical length scale $\lambda_o$, and that it consists of fast-moving, collisionless particles. The overdensity will have a characteristic collapse time given by $T_c \sim 1/\sqrt{G\rho_m}$. If the particles making up the overdensity have large velocities, they may travel a longer distance than $\lambda_o$ in the time interval $T_c$. Then the overdensity is not an overdensity anymore, and structure will not grow on this scale.
The velocity of a particle will of course depend on its mass. So the scale on which this free streaming effect applies also depends on the mass of the particles. This is an extremely important clue to understanding why such small fellows as neutrinos can have observable effects on CMB and (especially) LSS: Neutrinos, with their peculiarly small mass, will leave a distinct imprint on the structure formation of the universe as they suppress structure formation on small scales.
To be a bit more quantitative, the typical free-streaming scale, $\lambda_{FS}$, can be found simply by substituting the thermal velocity for the speed of sound in the definition of the Jeans length (4.73), and we have
$$\lambda_{FS} = 2\pi\sqrt{\frac{2}{3}}\,\frac{v_{th}(t)}{H(t)} = 2\pi\frac{a(t)}{k_{FS}} \tag{4.74}$$
where $v_{th}$ is the thermal velocity of the particle considered and $k_{FS}$ is the comoving free-streaming wave number. The mass dependence is hidden in $v_{th}$. For non-relativistic neutrinos the thermal velocities are related to the expectation value of the momentum by $\langle p_\nu\rangle = m_\nu v_{th}$. We then have that
$$v^\nu_{th} = \frac{\langle p_\nu\rangle}{m_\nu} = \frac{3T_\nu}{m_\nu} = \frac{3T^0_\nu}{m_\nu a}. \tag{4.75}$$
Inserting $T^0_\nu = 1.946$ K and converting to physical units, we find that
$$v^\nu_{th} = 1.5\times10^5\,a^{-1}\left(\frac{\mathrm{eV}}{m_\nu}\right)\mathrm{m/s} \tag{4.76}$$
so if a typical scale of the neutrino mass is ∼ 0.1 eV, the neutrinos will have a typical thermal velocity of $v^\nu_{th} \sim 10^6$ m/s today. Using the first Friedmann equation (3.24) to insert for $H(t)$ in (4.74) we find that for an epoch where $\Omega_r$ is negligible we have
$$\lambda_{FS}(t) = 2\pi\sqrt{\frac{2}{3}}\,\frac{v_{th}(t)}{H_0\sqrt{\Omega_\Lambda + \Omega_m a^{-3}}} \approx \frac{7.7}{a\sqrt{\Omega_\Lambda + \Omega_m a^{-3}}}\left(\frac{\mathrm{eV}}{m_\nu}\right)h^{-1}\,\mathrm{Mpc}. \tag{4.77}$$
CHAPTER 4. COSMOLOGICAL PERTURBATION THEORY
Information on LSS is usually given by wavenumber, so it is also useful to have an expression for $k_{FS}$ which, using (4.74) and (4.77), is given by
\[ k_{FS}(t) = 0.82\, a^2 \sqrt{\Omega_\Lambda + \Omega_m a^{-3}} \left( \frac{m_\nu}{\mathrm{eV}} \right) h\,\mathrm{Mpc}^{-1}. \tag{4.78} \]
To establish some notions about which scales we are talking about, we find that with $m_\nu \sim 0.1$ eV we have $k_{FS} \sim 0.1\,h\,\mathrm{Mpc}^{-1}$ today. This is around the scale where it is common to assume that non-linear effects become important in the matter clustering. LSS surveys like SDSS probe scales around $k \sim (0.01 - 0.2)\,h\,\mathrm{Mpc}^{-1}$, so effects occurring around the free-streaming scale of a typical massive neutrino should be detectable by such LSS surveys.
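The numbers quoted above can be reproduced with a few lines of Python. This is only a minimal sketch that evaluates (4.76), (4.77) and (4.78) with the rounded prefactors given in the text. The function names and the default values Ωm = 0.25, ΩΛ = 0.75 are illustrative assumptions, not something taken from the numerical codes used in this thesis.

```python
import math

# Rounded prefactors as quoted in (4.76), (4.77) and (4.78).
V_TH_PREFACTOR = 1.5e5       # m/s, for m_nu = 1 eV at a = 1
LAMBDA_FS_PREFACTOR = 7.7    # h^-1 Mpc
K_FS_PREFACTOR = 0.82        # h Mpc^-1


def expansion_factor(a, omega_m, omega_lambda):
    """sqrt(Omega_Lambda + Omega_m a^-3), i.e. H/H0 with radiation neglected."""
    return math.sqrt(omega_lambda + omega_m * a ** -3)


def thermal_velocity(m_nu_ev, a=1.0):
    """Thermal velocity in m/s from (4.76)."""
    return V_TH_PREFACTOR / (a * m_nu_ev)


def lambda_fs(m_nu_ev, a=1.0, omega_m=0.25, omega_lambda=0.75):
    """Free-streaming length in h^-1 Mpc from (4.77)."""
    e = expansion_factor(a, omega_m, omega_lambda)
    return LAMBDA_FS_PREFACTOR / (a * e * m_nu_ev)


def k_fs(m_nu_ev, a=1.0, omega_m=0.25, omega_lambda=0.75):
    """Comoving free-streaming wave number in h Mpc^-1 from (4.78)."""
    e = expansion_factor(a, omega_m, omega_lambda)
    return K_FS_PREFACTOR * a ** 2 * e * m_nu_ev


if __name__ == "__main__":
    m = 0.1  # eV, the typical mass scale used in the text
    print("v_th      =", thermal_velocity(m), "m/s")
    print("lambda_FS =", lambda_fs(m), "h^-1 Mpc")
    print("k_FS      =", k_fs(m), "h Mpc^-1")
```

As a consistency check, the product of `k_fs` and `lambda_fs` divided by 2πa should be close to one for any a, as required by (4.74). The small deviation comes from the rounding of the prefactors.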
From (4.77) we see that $\lambda_{FS}$ will grow during the matter dominated epoch, but only as $\lambda_{FS} \propto \frac{1}{aH} \propto t^{1/3}$ (footnote 8). That means that a comoving free-streaming length defined by $\lambda^C_{FS} \equiv \lambda_{FS}/a$ will decrease like $\lambda^C_{FS} \propto t^{-1/3}$. The above relations are derived for non-relativistic neutrinos. Ultra-relativistic neutrinos will have a free-streaming length corresponding to the Hubble length, and it will grow in the matter dominated epoch. A neutrino passing through its non-relativistic transition during the matter dominated epoch will thus experience a maximum value for its $\lambda^C_{FS}$ and a corresponding minimum $k_{nr}$ for its comoving wave number given by [24]
\[ k_{nr} \approx 0.018\, \Omega_m^{1/2} \left( \frac{m_\nu}{\mathrm{eV}} \right)^{1/2} h\,\mathrm{Mpc}^{-1}. \tag{4.79} \]
This means that perturbations on all scales smaller than the one corresponding to $k_{nr}$ (that is, all modes with $k > k_{nr}$) will to some extent be suppressed by neutrino free-streaming. If we now set $m_\nu \sim 0.1$ eV and $\Omega_m \sim 0.25$ we find that $k_{nr} \sim 3 \times 10^{-3}\,h\,\mathrm{Mpc}^{-1}$. This scale is larger than what we can observe in LSS surveys today. But all scales smaller than this will of course also be subject to neutrino free-streaming effects, which in principle can be observed. This effect can be seen in Figure 4.3.
We now define a new variable $\zeta \equiv \Omega_\nu / \Omega_m$ called the neutrino fraction (footnote 9). If we assume that the effect on the metric perturbations from the neutrinos is negligible, we will expect the matter power-spectrum in a model with massive neutrinos to be
\[ P(k) = \begin{cases} \langle \delta^2_{CDM} \rangle & \text{if } k < k_{nr} \\ (1 - \zeta)^2 \langle \delta^2_{CDM} \rangle & \text{if } k \gg k_{nr} \end{cases} \tag{4.80} \]
with a region of smooth transition in between. But the effect of the massive neutrinos on the perturbed metric cannot be neglected, and using semi-analytic considerations, assuming that $\zeta \ll 1$, it can be shown that for modes with $k \gg k_{nr}$ [55, 24]
\[ \frac{P(k)|_{\zeta} - P(k)|_{\zeta=0}}{P(k)|_{\zeta=0}} \approx -8\zeta. \tag{4.81} \]
Footnote 8: Since we from the first Friedmann equation have that $\sqrt{\Omega_\Lambda + \Omega_m a^{-3}} \propto H$.
Footnote 9: The neutrino fraction is often denoted by $f_\nu$, but since I have used this variable for the neutrino distribution function, I will use ζ for the neutrino fraction.
[Figure 4.3 plot: P(k) in (Mpc/h)^3 against k in h/Mpc on logarithmic axes, with curves for Mν = 0.0, 0.3, 1.0 and 2.0 eV.]
Figure 4.3: The matter power-spectrum for different values of Mν . The changes
in Mν have been compensated by corresponding changes in ΩCDM such that Ωm
is equal in the different cases. The other cosmological parameters have been set
to typical concordance values. We see that from a certain scale the neutrinos start
to suppress the matter power spectrum. On larger scales the neutrino velocity is
negligible, and massive neutrinos will affect the matter power spectrum in the same
way as ordinary CDM. We also see that the suppression is mass-dependent and
almost constant on scales with k ≫ knr. The plots are produced using CAMB.
This suppression is a factor 4 larger than what one would expect from (4.80). In the naive considerations leading to (4.80) the neutrinos were assumed to act independently of the rest of the universe. The enlarged effect of massive neutrinos on the power spectrum is due to both the effects that were mentioned at the start of this section:
• A larger ζ (and Ων) will make $a_{eq}$ larger. Thus the matter perturbations will have less time to grow. Sticking to a flat universe, an increase in Ων will imply a reduction in ΩCDM. Assuming that neutrinos will stay relativistic during radiation domination, $a_{eq}$ is given by
\[ a_{eq} = \frac{\Omega_r}{\Omega_{CDM} + \Omega_b} \approx 1.1\, \frac{\Omega_\gamma}{\Omega_{CDM} + \Omega_b}. \tag{4.82} \]
Here $\Omega_r$ contains the energy fraction of both photons and neutrinos today, assuming that the neutrinos remain massless until today (i.e. that you can trace the energy density back in time by $\rho_r = \rho_{r0} a^{-4}$).
• That neutrinos do not cluster on scales with k ≫ knr will suppress the amplitude of the metric potentials ψ and φ slightly and thus slow down the growth of matter perturbations.
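To get a feeling for the size of the suppression, the naive estimate (4.80) and the linear-theory result (4.81) can be compared numerically. The following is a minimal sketch using the relation Ων h² = Mν/94.1 eV. The function names and the default values Ωm = 0.25 and h = 0.7 are illustrative assumptions.

```python
def neutrino_fraction(m_nu_total_ev, omega_m=0.25, h=0.7):
    """zeta = Omega_nu / Omega_m, using Omega_nu h^2 = M_nu / 94.1 eV."""
    omega_nu_h2 = m_nu_total_ev / 94.1
    return omega_nu_h2 / (omega_m * h ** 2)


def naive_suppression(zeta):
    """Relative change in P(k) for k >> k_nr from the naive estimate (4.80)."""
    return (1.0 - zeta) ** 2 - 1.0


def linear_suppression(zeta):
    """Relative change in P(k) for k >> k_nr from (4.81), valid for zeta << 1."""
    return -8.0 * zeta


if __name__ == "__main__":
    for m_total in (0.3, 1.0):
        zeta = neutrino_fraction(m_total)
        print(m_total, zeta, naive_suppression(zeta), linear_suppression(zeta))
```

For small ζ the ratio between the two estimates is 8/(2 − ζ), which is where the factor 4 quoted above comes from.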
Chapter 5
Cosmological neutrino mass limits
We have now established some concepts about neutrino masses, standard cosmology, perturbation theory, structure formation and how massive neutrinos affect
such structure formation. Now it is time to see how this knowledge can be used to
put cosmological constraints on neutrino masses. First I will in some detail consider the effect of massive neutrinos on the CMB power spectrum and how we can
constrain the neutrino mass from CMB experiments alone. Then I will add more
data to the analysis and see how this affects the mass limits within the standard
ΛCDM model. When these mass limits are established and everything looks nice
and beautiful, I will consider the effect of relaxing the constraint that dark energy
is a cosmological constant and allow for models with $w_X \neq -1$. Finally I will
study the relation between cosmological neutrino mass limits and the Heidelberg-Moscow result on neutrinoless double β-decay.
5.1 Massive neutrinos and CMB
This section is mainly a review of the results obtained in [26] by Ichikawa et al.
This was the first paper claiming to derive good upper limits on Mν from CMB data
alone. Before this, it was commonly believed that one needed additional data, e.g.
from LSS surveys, to derive any sensible limits. For instance, in the publication
of the results of the first-year data from the WMAP satellite [56], they did not put
any limit on the neutrino mass using only CMB data. However, in the publication
of their 3-year data [39] they confirmed the Ichikawa et al. result. In my own runs
with the public Markov chain Monte-Carlo code CosmoMC [57] (see Appendix C)
I also get similar results from CMB-data alone as in [26] and [39].
Although the paper [26] is mainly dealing with numerical techniques, I will
here focus on their analytical sections where they discuss why it is plausible to get
neutrino mass limits from CMB alone, and how much we can expect to tighten
such limits with better measurement of CMB anisotropies in the future. At the end
of the chapter I will briefly discuss the numerical results found and compare them
with my own CosmoMC runs.
5.1.1 Reduced CMB observables
In [26] they base their discussion mainly upon the “reduced CMB observables” which were introduced in [58]. The four quantities they consider are the position of the first peak, $l_1$, the height of the first peak relative to the amplitude at $l = 10$,
\[ H_1 \equiv \left( \frac{\Delta T_{l_1}}{\Delta T_{10}} \right)^2 \tag{5.1} \]
the amplitude of the second peak relative to the first,
\[ H_2 \equiv \left( \frac{\Delta T_{l_2}}{\Delta T_{l_1}} \right)^2 \tag{5.2} \]
and the amplitude of the third peak relative to the first,
\[ H_3 \equiv \left( \frac{\Delta T_{l_3}}{\Delta T_{l_1}} \right)^2 \tag{5.3} \]
where $(\Delta T_l)^2 \equiv l(l+1)C_l/2\pi$ and $C_l$ is the multipole coefficient of the lth multipole in the temperature CMB field.
In the numerical simulations in [26], they compare models with and without massive neutrinos with the observed data for the reduced CMB observables. They find that $\chi^2$ increases the more neutrino mass they add to the model, and that a neutrino mass density, $\omega_\nu = \Omega_\nu h^2 = M_\nu / (94.1\,\mathrm{eV})$, larger than ∼ 0.02 is not compatible with observations. That corresponds to an upper limit on the total neutrino mass of $M_\nu \lesssim 2$ eV, which is comparable to results found earlier by combining CMB and large scale structure (LSS) observations [20] (see also section 2.4.3). In [26] they also state that it will be difficult to improve this limit significantly by using CMB data alone. The reasons for this will hopefully become obvious in the next section.
5.1.2 Analytic considerations on the effect of massive neutrinos
The position of the first peak
An important epoch when considering massive neutrinos is the epoch in which
the neutrinos become non-relativistic. In this epoch the neutrinos change from
behaving like radiation to behaving like CDM. This doesn’t happen in an instant,
nor does it happen to all of the neutrinos at the same time, but as a first (and
usually rather good) approximation, one may assume that all neutrinos become
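non-relativistic as their average momentum $p_\nu$ becomes similar to the neutrino mass $m_\nu$. That corresponds to $T_{\nu,nr} = m_\nu / 3$. The corresponding redshift is given by
\[ \begin{aligned} 1 + z_{nr} &= \frac{a_0}{a_{nr}} = \frac{T_{\nu,nr}}{T_{\nu,0}} = \frac{\frac{1}{3} m_\nu}{\left( \frac{4}{11} \right)^{1/3} T_{\gamma,0}} \\ &= \frac{1.602 \times 10^{-19}\,\mathrm{J/eV}}{3 \left( \frac{4}{11} \right)^{1/3} \times 2.725\,\mathrm{K} \times 1.38 \times 10^{-23}\,\mathrm{J/K}} \left( \frac{m_\nu}{\mathrm{eV}} \right) \\ &= 1.989 \times 10^3\, (m_\nu / \mathrm{eV}) \\ &= 6.24 \times 10^4\, \omega_\nu \end{aligned} \tag{5.4} \]
In the last equality it is used that $\omega_\nu = \frac{M_\nu}{94.1\,\mathrm{eV}} = \frac{3 m_\nu}{94.1\,\mathrm{eV}}$, assuming three neutrino species with degenerate masses. If Mν is anywhere close to the upper limit considered here, the neutrino masses will indeed be degenerate, since the measurements of oscillations of solar and atmospheric neutrinos give mass square differences between the neutrino mass eigenstates of $\Delta m^2_{21} = 7 \times 10^{-5}\,\mathrm{eV}^2$ and $\Delta m^2_{32} = 3 \times 10^{-3}\,\mathrm{eV}^2$.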
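The arithmetic in (5.4) is easy to check numerically. The sketch below uses the same rounded constants as the equation. The function names are illustrative, and the last helper simply inverts the relation to find the ων at which the transition coincides with recombination at z∗ = 1088.

```python
EV_IN_JOULE = 1.602e-19   # J/eV, as used in (5.4)
K_BOLTZMANN = 1.38e-23    # J/K, as used in (5.4)
T_GAMMA_0 = 2.725         # K, photon temperature today


def one_plus_z_nr(m_nu_ev):
    """1 + z_nr for a single neutrino mass in eV, using T_nu,nr = m_nu / 3."""
    t_nu_0_ev = (4.0 / 11.0) ** (1.0 / 3.0) * T_GAMMA_0 * K_BOLTZMANN / EV_IN_JOULE
    return m_nu_ev / (3.0 * t_nu_0_ev)


def one_plus_z_nr_from_omega(omega_nu, n_species=3):
    """Same quantity expressed through omega_nu = n_species * m_nu / 94.1 eV."""
    m_nu_ev = 94.1 * omega_nu / n_species
    return one_plus_z_nr(m_nu_ev)


def omega_nu_threshold(z_star=1088.0):
    """omega_nu for which the neutrinos turn non-relativistic at z_star."""
    return (1.0 + z_star) / one_plus_z_nr_from_omega(1.0)


if __name__ == "__main__":
    print(one_plus_z_nr(1.0))             # close to 1.989e3
    print(one_plus_z_nr_from_omega(1.0))  # close to 6.24e4
    print(omega_nu_threshold())
```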
We can now compare this to the redshift at recombination, z∗ = 1088 [56],
which is insensitive to the neutrino mass since neutrinos decoupled from baryonic
matter a long time before this epoch. Neutrinos became nonrelativistic before recombination if $z_{nr} > z_*$, that is
\[ 6.24 \times 10^4\, \omega_\nu > 1089 \tag{5.5} \]
\[ \omega_\nu > 0.017 \tag{5.6} \]
which corresponds to $M_\nu \gtrsim 1.6$ eV. In the graphs from the numerical simulations in [26] you can see that this value of ων corresponds to turning points for the reduced CMB observables, at least for H1, H2 and H3. This is intuitively easy to understand. If the neutrinos become nonrelativistic after recombination, the CMB is already produced at the time when the neutrinos reveal their mass and become nonrelativistic. This is why they claim in [26] that a tightening of the upper bound of the neutrino mass below ων ≈ 0.017 requires use of other observables like LSS.
We now denote the energy density like $\omega = \Omega h^2 = \frac{\rho h^2}{\rho_{cr,0}}$ where $\rho_{cr,0} = \frac{3H_0^2}{8\pi G}$. This gives the standard expressions for matter and photon density
\[ \frac{\rho_m(a) h^2}{\rho_{cr,0}} = \omega_m a^{-3} \tag{5.7} \]
\[ \frac{\rho_\gamma(a) h^2}{\rho_{cr,0}} = \omega_\gamma a^{-4} \tag{5.8} \]
The photon density today is $\omega_\gamma = 2.48 \times 10^{-5}$, provided $T_0 = 2.725$ K [59]. The neutrino energy density $\rho_\nu$ is given by integration over the Fermi-Dirac distribution,
\[ \rho_\nu = \frac{g_\nu}{(2\pi)^3} \int \frac{E_\nu}{e^{(E_\nu - \mu_\nu)/T_\nu} + 1}\, d^3p \tag{5.9} \]
where $E_\nu = \sqrt{p_\nu^2 + m_\nu^2}$ and $\mu_\nu$ is the chemical potential.
The chemical potential for νe is again assumed to be negligible. gν in (5.9)
is the number of degrees of freedom for neutrinos. Assuming three flavors and
their antiparticles, gν = 6, which is used in [26] . This is accurate for massless
neutrinos which have only one spin degree of freedom (left-handed). However,
massive neutrinos may oscillate into right-handed neutrinos, which might modify
gν slightly. They also assume that the denominator in (5.9) can be approximated by $e^{E/T} \to e^{p/T}$. Since neutrinos were highly relativistic when they decoupled (at $T_\nu \sim 1$ MeV), both these approximations should be extremely good, as already argued for in section 4.5.1.
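Since (5.9) only depends on the norm of p we use that $d^3p = 4\pi p^2 dp$. Defining $x = \frac{p_\nu}{T_\nu}$ we have
\[ \begin{aligned} \rho_\nu &= \frac{6 \times 4\pi}{(2\pi)^3} \int_0^\infty \frac{\sqrt{p^2 + m_\nu^2}\, p^2\, dp}{e^{p/T_\nu} + 1} \\ &= \frac{3}{\pi^2} \int_0^\infty \frac{\sqrt{x^2 T_\nu^2 + m_\nu^2}\, x^2 T_\nu^3\, dx}{e^x + 1} \\ &= \frac{3 T_\nu^4}{\pi^2} \int_0^\infty \frac{\sqrt{x^2 + y^2}\, x^2\, dx}{e^x + 1} \end{aligned} \tag{5.10} \]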
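where
\[ y = \frac{m_\nu}{T_\nu} = m_\nu \left( \frac{11}{4} \right)^{1/3} T_\gamma^{-1} = m_\nu \left( \frac{11}{4} \right)^{1/3} a\, T_{\gamma,0}^{-1} \tag{5.11} \]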
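Since the temperature scales like $a^{-1}$ we have
\[ T_{\gamma,0}^4 = \omega_\gamma \frac{\rho_{cr,0}}{\rho_\gamma h^2}\, T_\gamma^4 \tag{5.12} \]
where I have used (5.8). That gives
\[ \begin{aligned} \frac{\rho_\nu h^2}{\rho_{cr,0}} &= \frac{h^2}{\rho_{cr,0}} \frac{3}{\pi^2} \left( \frac{4}{11} \right)^{4/3} a^{-4}\, T_{\gamma,0}^4 \int_0^\infty \frac{\sqrt{x^2 + y^2}\, x^2\, dx}{e^x + 1} \\ &= \frac{3}{\pi^2} \left( \frac{4}{11} \right)^{4/3} a^{-4}\, \omega_\gamma \frac{T_\gamma^4}{\rho_\gamma} \int_0^\infty \frac{\sqrt{x^2 + y^2}\, x^2\, dx}{e^x + 1} \\ &= \frac{45}{\pi^4} \left( \frac{4}{11} \right)^{4/3} a^{-4}\, \omega_\gamma \int_0^\infty \frac{\sqrt{x^2 + y^2}\, x^2\, dx}{e^x + 1} \end{aligned} \tag{5.13} \]
Here I have used that $\rho_\gamma = \frac{\pi^2}{15} T_\gamma^4$ (which is obtained by integrating over the Bose-Einstein distribution for an ultrarelativistic particle with two degrees of freedom).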
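The integral in (5.13) has no closed form for general y, but it is simple to evaluate numerically. The sketch below uses a plain composite Simpson rule from the standard library only. The cut-off x_max = 60 and the number of steps are assumptions chosen so that the exponentially suppressed tail is negligible, and the function names are illustrative. Two limits serve as checks: for y → 0 the integral is 7π⁴/120, which gives the familiar massless result of 3 × (7/8)(4/11)^{4/3} times the photon density, and for y ≫ 1 the integral approaches (3/2)ζ(3) y.

```python
import math


def simpson(f, lo, hi, n=4000):
    """Composite Simpson rule on [lo, hi] with n (even) sub-intervals."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0


def neutrino_integral(y, x_max=60.0):
    """Integral of sqrt(x^2 + y^2) x^2 / (e^x + 1) from 0 to x_max."""
    def integrand(x):
        return math.sqrt(x * x + y * y) * x * x / (math.exp(x) + 1.0)
    return simpson(integrand, 0.0, x_max)


def rho_nu_over_rho_gamma(y):
    """Right-hand side of (5.13) divided by omega_gamma a^-4."""
    prefactor = 45.0 / math.pi ** 4 * (4.0 / 11.0) ** (4.0 / 3.0)
    return prefactor * neutrino_integral(y)


if __name__ == "__main__":
    for y in (0.0, 1.0, 3.0, 10.0):
        print(y, rho_nu_over_rho_gamma(y))
```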
Assuming that the vacuum energy is a cosmological constant we have
\[ \frac{\rho_\Lambda h^2}{\rho_{cr,0}} = \Omega_\Lambda h^2 = \omega_\Lambda = h^2 - \omega_m - \omega_\nu \tag{5.14} \]
where the last equality comes from the flatness assumption and neglecting the energy density from radiation at late times. The total energy density is $\rho_{tot} = \rho_m + \rho_\gamma + \rho_\nu + \rho_\Lambda$. I will use conformal time given by (footnote 1)
\[ \eta(a) = \int_0^t \frac{dt'}{a'} = \int_0^a \frac{da'}{\dot{a}' a'} = \int_0^a \frac{da'}{a'^2 H}. \tag{5.15} \]
Here H can be expressed in terms of $\rho_{tot}$ using the first Friedmann equation. To find the position of the mth peak one needs a full solution of the coupled Boltzmann equations. But what is done here is to parametrize it like
\[ l_m = l_A (m - \phi_m) \tag{5.16} \]
where $\phi_m$ is a phase factor that depends on m, and $l_A$ is the acoustic scale defined by
\[ l_A = \pi\, \frac{r_\theta(\eta_*)}{r_s(\eta_*)}. \tag{5.17} \]
$r_\theta(\eta_*)$ is the comoving angular diameter distance to the last scattering surface. In a flat universe that is $r_\theta(\eta_*) = \eta_0 - \eta_*$. $r_s(\eta_*)$ is the sound horizon at recombination, defined by
\[ r_s(a_*) \equiv \int_0^{\eta(a_*)} c_s\, d\eta = \int_0^{a_*} c_s(a')\, \frac{da'}{a'^2 H} \tag{5.18} \]
where $c_s$ is the sound speed in the fluid, given by
\[ c_s = \sqrt{\frac{1}{3(1 + R)}} \tag{5.19} \]
and R is the baryon to photon ratio,
\[ R = \frac{3\rho_b}{4\rho_\gamma} = \frac{3 a\, \omega_b}{4\, \omega_\gamma}. \tag{5.20} \]
Footnote 1: Here a prime does not mean differentiation with respect to η.
Since neutrinos are only weakly interacting, they do not have any effect on the sound speed, but they will alter the sound horizon (5.18) through modification of H. The main effect comes from the fact that a large ων will postpone the time of matter-radiation equality (footnote 2), which will slightly reduce the sound horizon at recombination. To get an idea of how $l_A$ varies with increasing ων I have plotted $l_A$, $r_\theta(\eta_*)$ and $r_s(\eta_*)$ as functions of ων in Figure 5.1. Here I also show the phase factor φ1 for the first peak when considering massive neutrinos (to be discussed later).
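As an illustration of how (5.17) to (5.20) fit together, the acoustic scale can be evaluated for the simplest case of massless neutrinos in a flat ΛCDM model. The sketch below is not the calculation behind Figure 5.1, which includes the massive neutrino density (5.13) in H. It only uses photons plus Nν massless neutrinos as radiation, and the parameter values ωm = 0.14, ωb = 0.022, h = 0.70 and z∗ = 1088 are illustrative assumptions. Distances are in units of c/(100 km/s/Mpc), which cancel in the ratio.

```python
import math

OMEGA_GAMMA = 2.470e-5  # photon density omega_gamma for T = 2.725 K


def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi] with n (even) sub-intervals."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0


def acoustic_scale(omega_m=0.14, omega_b=0.022, h=0.70, n_nu=3.0, z_star=1088.0):
    """Return (l_A, r_theta, r_s) for massless neutrinos in a flat universe."""
    omega_r = 0.5 * OMEGA_GAMMA * (2.0 + 0.4542 * n_nu)
    omega_lambda = h * h - omega_m - omega_r   # flatness
    a_star = 1.0 / (1.0 + z_star)

    def a2_hubble(a):
        # a^2 H in units of 100 km/s/Mpc
        return math.sqrt(omega_r + omega_m * a + omega_lambda * a ** 4)

    def sound_speed(a):
        r_baryon = 3.0 * a * omega_b / (4.0 * OMEGA_GAMMA)   # (5.20)
        return 1.0 / math.sqrt(3.0 * (1.0 + r_baryon))       # (5.19)

    # Sound horizon (5.18).
    r_s = simpson(lambda a: sound_speed(a) / a2_hubble(a), 0.0, a_star)
    # eta_0 - eta_*, integrated in u = ln(a) to resolve small a.
    r_theta = simpson(lambda u: math.exp(u) / a2_hubble(math.exp(u)),
                      math.log(a_star), 0.0)
    return math.pi * r_theta / r_s, r_theta, r_s


if __name__ == "__main__":
    l_a, r_theta, r_s = acoustic_scale()
    print("l_A =", l_a)
```

With these illustrative parameters the result should land in the same range as the ων = 0 end of the $l_A$ panel in Figure 5.1, although exact agreement is not expected since the parameter values used for the figure are not given here.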
[Figure 5.1 plot: four panels showing $l_A$, $r_\theta(\eta_*)$, $r_s(\eta_*)$ and φ1 (both φ1(ξ) and φ1(r∗)) as functions of ων over the range 0 to 0.06.]
Figure 5.1: l1 and its constituents. l1 is defined by (5.16) and lA is defined by (5.17). We see that both $r_\theta(\eta_*)$ and $r_s(\eta_*)$ decrease as ων increases, but that $r_\theta(\eta_*)$ decreases faster, such that lA also decreases. I have also plotted the phase factor φ1 in the case where massive neutrinos contribute to the early ISW (φ1(ξ)), and for the case where they do not and φ1(r∗) is constant.
The phase factor in (5.16) arises from the early integrated Sachs-Wolfe effect, and as mentioned, it is nontrivial to determine it. The Sachs-Wolfe effect is due to photons redshifting when traveling out of gravitational potential wells. These potential wells are of course density perturbations, and the growth of those depends on the amount of radiation, which suppresses such growth through free-streaming. Massless neutrinos in this case behave like a radiation part of the energy density, while non-relativistic neutrinos behave like CDM. The phase shift φm therefore depends on whether the neutrinos are relativistic or not, and hence on their mass, which makes φm a probe of the neutrino mass.
Footnote 2: As discussed in section 4.9.1.
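In [58] they give a fitting formula for φm,
\[ \phi_m \approx b_m \left( \frac{r_*}{0.3} \right)^{0.1} \tag{5.21} \]
where $r_* = \frac{\rho_{r*}}{\rho_{m*}}$ is the radiation to matter density at recombination and
\[ b_1 = 0.267, \quad b_2 = 0.24, \quad b_3 = 0.35, \quad \ldots \tag{5.22} \]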
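Combined with (5.16), the fitting formula gives a quick estimate of the peak positions. The sketch below evaluates (5.21) with the coefficients in (5.22) for massless neutrinos, using $r_* = \omega_r (1 + z_*)/\omega_m$ with the radiation density of photons plus Nν massless neutrinos. The values ωm = 0.14 and $l_A$ = 300 are illustrative assumptions, and the function names are not from any of the codes used in this thesis.

```python
B_COEFFICIENTS = {1: 0.267, 2: 0.24, 3: 0.35}  # from (5.22)


def radiation_to_matter(omega_m, n_nu=3.0, z_star=1088.0):
    """r_* = rho_r / rho_m at recombination for massless neutrinos."""
    omega_r = 1.235e-5 * (2.0 + 0.4542 * n_nu)
    return omega_r * (1.0 + z_star) / omega_m


def phase_shift(m, r_star):
    """Fitting formula (5.21) for the phase shift of the m-th peak."""
    return B_COEFFICIENTS[m] * (r_star / 0.3) ** 0.1


def peak_position(m, l_acoustic, r_star):
    """Position of the m-th peak from (5.16)."""
    return l_acoustic * (m - phase_shift(m, r_star))


if __name__ == "__main__":
    r_star = radiation_to_matter(omega_m=0.14)
    print("r_*   =", r_star)
    print("phi_1 =", phase_shift(1, r_star))
    print("l_1   =", peak_position(1, 300.0, r_star))
```

The weak power of 0.1 in (5.21) means that φ1 changes by less than one percent when r∗ changes by ten percent, which is why the shift of the first peak from this effect is small.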
This fitting formula is obtained from fits to the first peak, assuming ωb = 0.02 and
massless neutrinos. Of course, here we are dealing with everything but massless
neutrinos, but the claim in [26] is that (5.21) will be a very good approximation
when splitting the neutrino energy density into a matter component and a radiation
component in a proper way. The proper way is to treat the neutrinos having momentum pν < mν as matter and the ones having momentum pν > mν as radiation.
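In (5.13) we use $x = \frac{p_\nu}{T_\nu}$ and $y = \frac{m_\nu}{T_\nu}$. So we have radiation when x > y and matter when x < y. Thus
\[ \frac{\rho_{\nu,r} h^2}{\rho_{cr,0}} = \frac{45}{\pi^4} \left( \frac{4}{11} \right)^{4/3} a^{-4}\, \omega_\gamma \int_y^\infty \frac{\sqrt{x^2 + y^2}\, x^2\, dx}{e^x + 1} \tag{5.23} \]
\[ \frac{\rho_{\nu,m} h^2}{\rho_{cr,0}} = \frac{45}{\pi^4} \left( \frac{4}{11} \right)^{4/3} a^{-4}\, \omega_\gamma \int_0^y \frac{\sqrt{x^2 + y^2}\, x^2\, dx}{e^x + 1} \tag{5.24} \]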
Using this we can define a new r-like quantity
\[ \xi \equiv \frac{\rho_\gamma + \rho_{\nu,r}}{\rho_m + \rho_{\nu,m}} \tag{5.25} \]
which replaces r in (5.21). In Figure 5.2 I have tried to reproduce the results from Figure 6 in [26]. In one case I have calculated φ1 using r∗ as defined earlier, and in the other case I have replaced r∗ by ξ in the expression for φ1 given in (5.21). Here it is shown that l1(ων) calculated with the ξ-variable follows the curve calculated with r∗ quite accurately for ων < 0.017. This is in good agreement with what we would expect, since, as discussed earlier, if ων < 0.017 the neutrinos were relativistic all the time before recombination. In Figure 5.1 I have shown φ1(ων) for both cases. The results I find agree well with [26]. In the plot one easily sees how the effect from massive neutrinos on the early ISW becomes significant for ων > 0.017. We see clearly that a larger ων shifts the first peak to the left. This is also confirmed by the numerical results from CAMB which are shown in Figure 5.3.
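The split of the neutrino density into a matter part (5.24) and a radiation part (5.23) can be sketched numerically by cutting the integral at x = y. The code below computes the fraction of the neutrino energy density that is counted as matter for a given y. The helper `y_at_redshift` uses y = 3(1 + z_nr)/(1 + z) together with the last line of (5.4). All function names, the cut-off x_max = 60 and the step count are illustrative assumptions.

```python
import math


def simpson(f, lo, hi, n=4000):
    """Composite Simpson rule on [lo, hi] with n (even) sub-intervals."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0


def split_integrals(y, x_max=60.0):
    """Return the (matter, radiation) parts of the integral in (5.13)."""
    def integrand(x):
        return math.sqrt(x * x + y * y) * x * x / (math.exp(x) + 1.0)
    cut = min(y, x_max)
    matter = simpson(integrand, 0.0, cut)        # x < y, eq. (5.24)
    radiation = simpson(integrand, cut, x_max)   # x > y, eq. (5.23)
    return matter, radiation


def matter_fraction(y):
    """Fraction of the neutrino energy density counted as matter."""
    matter, radiation = split_integrals(y)
    return matter / (matter + radiation)


def y_at_redshift(omega_nu, z):
    """y = m_nu / T_nu at redshift z for three degenerate neutrinos."""
    return 3.0 * 6.24e4 * omega_nu / (1.0 + z)


if __name__ == "__main__":
    for omega_nu in (0.005, 0.017, 0.05):
        y = y_at_redshift(omega_nu, 1088.0)
        print(omega_nu, y, matter_fraction(y))
```

By construction y ≈ 3 at recombination when ων ≈ 0.017, which is the case znr = z∗ discussed in connection with (5.6).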
[Figure 5.2 plot: l1 (from about 204 to 222) as a function of ων over the range 0 to 0.06.]
Figure 5.2: Dependence of l1 on ων . The dashed line comes from including contribution from massive neutrinos to early ISW. In the solid line this effect is not
included. As expected the two graphs converge as ων becomes less than 0.017.
This figure corresponds to Fig. 6 in [26] .
[Figure 5.3 plot: l(l+1)Cl/(2π) in (µK)² against multipole moment l from 0 to 2000, with curves for Mν = 0.0 eV and Mν = 1.0 eV.]
Figure 5.3: Here the CMB power spectrum is shown with three massless neutrinos and with three neutrinos with Mν = 1.0 eV. The increased energy density in massive neutrinos is compensated by reducing ΩCDM. We see that adding massive neutrinos enhances the first peaks and shifts the spectrum slightly to the left. The power spectra were calculated with CAMB.
Heights of acoustic peaks
As already mentioned (and as discussed more thoroughly in section 4.9.1), free streaming of massive neutrinos will smooth out gravitational wells and suppress perturbation growth on small scales, which in turn will enhance the baryonic oscillations. This results in larger temperature fluctuations within the neutrino free-streaming scale [26, 24]. Numerical calculations of the CMB power spectrum can be found in Figure 5.3. The multipole corresponding to the free-streaming scale is given by [60]
\[ l_{nr} \approx \frac{2\pi\, r_\theta(\eta_*)}{\eta_{nr}} \tag{5.26} \]
[Figure 5.4 plot: multipole l (from 0 to 700) as a function of ων over the range 0 to 0.06.]
Figure 5.4: This figure corresponds to Figure 7 in [26] . Here the multipole scale
corresponding to the neutrino free-streaming scale is shown.
In Figure 5.4 I show this lnr as a function of ων , just like they do in Figure 7
in [26] . For the “magic limit” of ων ≈ 0.017 this scale corresponds to lnr ≈ 300.
This implies that for this neutrino mass only the part of the CMB power-spectrum
with l > 300 is affected by the neutrino free-streaming effect.
We will now reduce the theory with massive neutrinos to an effective theory where the simpler equations for massless neutrinos apply by introducing three effective quantities, namely $\tilde{\omega}_m$, $\tilde{N}_\nu$ and $\tilde{h}$. $\tilde{\omega}_m$ is made up by counting the nonrelativistic component of the neutrinos from (5.24) as ordinary CDM, and we have
\[ \tilde{\omega}_m = \omega_m + \frac{\rho_{\nu,m}(a_*)}{\rho_{\nu,r}(a_*) + \rho_{\nu,m}(a_*)}\, \omega_\nu \tag{5.27} \]
To ensure the same value for the matter-radiation equality and the same amount of early integrated Sachs-Wolfe effect, we have to define an effective number of neutrino species, $\tilde{N}_\nu$. Using $\rho_\gamma = \frac{\pi^2}{15} T^4$ and that $\rho_{cr} = 8.098 \times 10^{-11}\, h^2\, \mathrm{eV}^4 = 1.469 \times 10^6\, h^2\, \mathrm{K}^4$ we have
\[ \frac{\rho_\gamma h^2}{\rho_{cr}} = \frac{2.470 \times 10^{-5}}{a^4} \tag{5.28} \]
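for $T_{\gamma,0} = 2.725$ K. Assuming massless neutrinos, the neutrino energy density is given by
\[ \rho_\nu = N_\nu \times \frac{7}{8} \left( \frac{4}{11} \right)^{4/3} \rho_\gamma \tag{5.29} \]
The total radiation component is then given by
\[ \frac{\rho_r h^2}{\rho_{cr}} = \frac{h^2}{\rho_{cr}} (\rho_\gamma + \rho_\nu) = \frac{1}{a^4} \times 1.235 \times 10^{-5}\, (0.4542 N_\nu + 2) \tag{5.30} \]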
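Matter-radiation equality is given by the condition
\[ \frac{\rho_m h^2}{\rho_{cr}} = \omega_m a_{eq}^{-3} \overset{!}{=} \left. \frac{\rho_r h^2}{\rho_{cr}} \right|_{a_{eq}}, \tag{5.31} \]
which gives that
\[ a_{eq}^{-1} = \frac{\omega_m \times 80968}{2 + 0.4542 N_\nu} \tag{5.32} \]
in the zero-mass case. Allowing massive neutrinos we have to replace $\omega_m$ by $\tilde{\omega}_m$ and define an effective $N_\nu$,
\[ \tilde{N}_\nu = \frac{80950\, \tilde{\omega}_m\, a_{eq}(\omega_\nu) - 2}{0.4542} \tag{5.33} \]
where $a_{eq}$ is calculated by setting $\xi(a_{eq}) = 1$ in (5.25). This is to allow for $a_{eq}$ to change as neutrinos also contribute to the matter density.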
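The relation between the equality scale factor and the number of neutrino species is simple enough to check by hand or with a few lines of code. The sketch below evaluates (5.32) and its inversion (5.33). Note that the text quotes the numerical constant as 80968 in (5.32) and 80950 in (5.33). The sketch uses 80968 in both directions so that the round trip is exact. The function names and the value ωm = 0.14 are illustrative assumptions, and in the full effective theory $a_{eq}$ would come from solving ξ(a_eq) = 1 rather than from (5.32).

```python
EQUALITY_CONSTANT = 80968.0   # numerical constant quoted in (5.32)
NU_WEIGHT = 0.4542            # 2 * (7/8) * (4/11)^(4/3)


def inverse_a_eq(omega_m, n_nu=3.0):
    """1/a_eq = 1 + z_eq for massless neutrinos, eq. (5.32)."""
    return omega_m * EQUALITY_CONSTANT / (2.0 + NU_WEIGHT * n_nu)


def effective_n_nu(omega_m_eff, a_eq):
    """Effective number of neutrino species, inverting (5.32) as in (5.33)."""
    return (EQUALITY_CONSTANT * omega_m_eff * a_eq - 2.0) / NU_WEIGHT


if __name__ == "__main__":
    z_eq_plus_one = inverse_a_eq(0.14)
    print("1 + z_eq =", z_eq_plus_one)
    print("N_nu recovered =", effective_n_nu(0.14, 1.0 / z_eq_plus_one))
```

A later equality (larger a_eq) at fixed matter density translates into a larger effective number of neutrino species, which is how the massive neutrinos enter the effective massless theory.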
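Since CMB perturbations depend on h through $\omega = \Omega h^2$ it is also useful to define the effective Hubble parameter $\tilde{h}$, especially for getting the late time ISW correct. For a flat universe we have $(\omega_m + \omega_\nu) h^{-2} + \Omega_\Lambda = 1$. To get this on the same form as in the massless theory we want an $\tilde{h}$ given by $\omega_m \tilde{h}^{-2} + \Omega_\Lambda = 1$, so
\[ \omega_m \tilde{h}^{-2} \overset{!}{=} (\omega_m + \omega_\nu)\, h^{-2} \quad \Rightarrow \quad \tilde{h} = h \sqrt{\frac{\omega_m}{\omega_m + \omega_\nu}} \tag{5.34} \]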
In [26] they plot H1 using this effective theory (using CMBFAST [61]), and it fits
very well the full numerical treatment that they have done. The results for H2 and
H3 using the effective theory are a lot worse. On these scales the free streaming
of massive neutrinos is important, and the effective theory does not include this
effect.
Concerning their four reduced CMB-variables l1, H1, H2 and H3, it turns out that both l1 and H1 respond to changes in neutrino mass also for ων < 0.017. This is however not the case for H2 and H3. These two variables are mostly sensitive to the amount of free-streaming of neutrinos before recombination, and are found to be nearly constant for ων < 0.017. The variation of l1 and H1 alone is not enough to constrain ων much. The effect on l1 can be nearly canceled by decreasing h. This will increase H1, but this effect can be corrected for by altering ns and ωb, and all these changes will leave H2 and H3 almost unaffected. Therefore tighter constraints than ων < 0.017 are hard to find without additional data, such as LSS surveys, to break parameter degeneracies.
Summing up, one can say that adding massive neutrinos to a standard ΛCDM
model alters the CMB power-spectrum by
• shifting the spectrum to the left. This effect is mainly due to the postponing
of matter-radiation equality which will reduce the sound horizon at recombination.
• enhancing the acoustic peaks. This is partly because of the postponed matter-radiation equality, and partly due to an enhancement of the early integrated Sachs-Wolfe effect because neutrino free streaming will speed up the decay of gravitational potentials.
5.1.3 Numerical results from CMB alone
The analytical discussion above was mostly to show how effects from massive
neutrinos appear in the CMB power-spectrum, and how good constraints we can
expect to get using CMB data alone.
In their full numerical treatment they find in [26] a limit of ων < 0.021 (95% C.L.) using only the first-year data from the WMAP satellite. This corresponds to a limit on the sum of the neutrino masses of Mν < 2.0 eV. This limit is found using a Monte Carlo approach with $\sim 10^5$ runs with CMBFAST. The WMAP team confirms this limit in the release of their three-year data [39] using a Monte Carlo Markov chain (see Appendix C) approach based on CAMB.
I have also tested the limits on Mν from CMB data using the CosmoMC code
[57] which uses a Monte Carlo Markov chain approach with CAMB. Here I have
used the same three year data from WMAP as in [39], but I have also tried to
add data from the small scale CMB experiments ACBAR [43], CBI [44] and VSA
[45] to see if that improves the mass limits. In [26] and [39] the runs were done
assuming a standard ΛCDM model with neutrinos added. I have also tested the robustness of the limits if one allows the equation of state parameter for dark energy, wX, to be different from −1 (but constant in time). This test is interesting because the nature of dark energy is poorly understood, and thus one should try to make as few assumptions as possible about it. In addition it was shown in a paper by Hannestad in 2005 [35] that there are degeneracies between wX and Mν. The results from my analysis are given in Table 5.1.
Data used                 | wX = −1 | wX free
WMAP3                     | 1.9 eV  | 2.1 eV
WMAP3, ACBAR, CBI, VSA    | 2.0 eV  | 2.1 eV
Table 5.1: Limits on Mν (95% C.L.) from CMB data alone using CosmoMC and the three-year WMAP data set. The limits are not improved by adding small scale data sets, and not much weakened by allowing for $w_X \neq -1$.
For WMAP data alone and with wX = −1 I find Mν < 1.9 eV at 95% C.L.
This is consistent with the limits from [26] and [39]. I also find that allowing
for $w_X \neq -1$ has only a tiny effect on the neutrino mass limits when using only
CMB data. Adding data from small scale CMB experiments is not improving the
mass limits at all. Actually, for the case wX = −1 the limit is slightly weakened
when adding these data sets, but the effect is not very significant. This might be
due to some statistical uncertainty in the CosmoMC-code, or due to some slight
inconsistency in the CMB data sets. That the small scale CMB data are not very
useful for constraining Mν looks plausible when studying Figure 5.3. Here we see
that the power spectra for massive and massless neutrinos converge for large values
of l.
5.2 Cosmology and neutrino mass hierarchies
So far I have only discussed the cosmological effect of the total neutrino mass Mν .
This can be done under the assumption that the neutrino masses are degenerate,
which is a good approximation for Mν ≫ ∆mij , where ∆mij denotes the mass
differences between the individual neutrino mass eigenstates. In Chapter 2 we saw
that from oscillation experiments the largest neutrino mass difference is |∆m32 | ≈
0.05eV. This should be compared to the current best upper bounds from cosmology
of Mν < 0.17 eV [34] and Mν < 0.3 eV [33], and we see that the absolute mass
limit is less than an order of magnitude higher than the largest mass difference. In
Figure 5.5 I have plotted the single eigenstate masses as a function of Mν both
in the case of normal and inverted mass hierarchy. From the plots we see that
the assumption of degenerate masses is a rather crude approximation, and that it is
worthwhile to study whether it could be possible to distinguish the mass hierarchies
or detect the single neutrino masses by cosmological observations.
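The individual masses for a given total mass follow from the two mass square differences once a hierarchy is chosen. The sketch below solves m1 + m2 + m3 = Mν by bisection on the lightest mass, using the values $\Delta m^2_{21} = 7 \times 10^{-5}\,\mathrm{eV}^2$ and $|\Delta m^2_{32}| = 3 \times 10^{-3}\,\mathrm{eV}^2$ quoted in section 5.1.2. It is only a minimal illustration of the kind of calculation behind a plot like Figure 5.5, not the code used to produce it, and the function names are assumptions.

```python
import math

DM21_SQ = 7e-5   # eV^2, solar mass square difference
DM32_SQ = 3e-3   # eV^2, absolute value of the atmospheric one


def masses_normal(m_lightest):
    """(m1, m2, m3) for a normal hierarchy, with m1 the lightest."""
    m1 = m_lightest
    m2 = math.sqrt(m1 * m1 + DM21_SQ)
    m3 = math.sqrt(m2 * m2 + DM32_SQ)
    return m1, m2, m3


def masses_inverted(m_lightest):
    """(m1, m2, m3) for an inverted hierarchy, with m3 the lightest."""
    m3 = m_lightest
    m2 = math.sqrt(m3 * m3 + DM32_SQ)
    m1 = math.sqrt(m2 * m2 - DM21_SQ)
    return m1, m2, m3


def solve_masses(total_ev, hierarchy="normal"):
    """Find the individual masses whose sum is total_ev, by bisection."""
    builder = masses_normal if hierarchy == "normal" else masses_inverted
    if total_ev < sum(builder(0.0)):
        raise ValueError("total mass below the minimum allowed by oscillations")
    lo, hi = 0.0, total_ev
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if sum(builder(mid)) < total_ev:
            lo = mid
        else:
            hi = mid
    return builder(0.5 * (lo + hi))


if __name__ == "__main__":
    for total in (0.17, 0.3, 2.0):
        print(total, solve_masses(total, "normal"), solve_masses(total, "inverted"))
```

The bisection works because the sum of the three masses increases monotonically with the lightest mass. For large Mν the three masses returned become nearly equal, which is the degenerate limit assumed elsewhere in this chapter.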
As far as the CMB spectrum is concerned, we have seen that the effect of neutrino
mass on the CMB power spectrum is mainly due to the shift of aeq to larger values.
And, as was pointed out in [26] and in section 5.1, with Mν < 1.6eV neutrinos will
become non-relativistic after recombination, and in that case information on neutrino masses will be hard to read out from the CMB power spectrum. So one would
not expect the CMB power spectrum to be much altered by adding information on
the mass splittings to our analysis.
[Figure 5.5 plot: two panels showing mi (eV) against Mν (eV) on logarithmic axes, with curves for m1, m2 and m3 and vertical lines marked Seljak et al. and Goobar et al.]
Figure 5.5: The individual neutrino mass eigenstates m1, m2 and m3 plotted as a function of the total neutrino mass Mν in the case of a normal (left panel) and inverted (right panel) mass hierarchy. The vertical lines are the current best upper limits from cosmology from Seljak et al. [34] and Goobar et al. [33]. With the Seljak et al. limit the non-degeneracy level between m1 and m3 is at 30% in the NH scheme and 40% in the IH scheme.
For LSS the situation is a bit more promising, since the power suppression
depends on the mass dependent free-streaming scale of the neutrino. For the nondegenerate case one would then expect to get three different free-streaming scales
associated with the matter power spectrum. But, as it is hard enough to detect this
“bend” in the matter power spectrum in the case of three degenerate neutrinos, one
would also here expect it to be extremely difficult to see the distinct effect of each
separate neutrino mass.
To do a quantitative analysis of the cosmological effect of the non-degeneracy of the neutrino masses, one has to turn to the numerical Boltzmann codes once again. Unfortunately none of the public Boltzmann codes today include the possibility of running with non-degenerate neutrino masses, and to do the full analysis one has to modify one of these codes to handle these extra parameters. This has been done in [15] by modifying CMBFAST and in [16] with a modified version of CAMB. The conclusion in both papers is that adding information on the neutrino mass splittings modifies the CMB and LSS power spectra (mainly the LSS spectrum) slightly, but that this modification is far too small to be detectable with the data that we have today. They also find it unlikely that the effect of non-degeneracy can be seen even when considering expected results from future CMB and LSS experiments. This verifies that the assumption of degenerate neutrino masses that we are usually making in cosmology is a valid approximation.
5.3 Mass limits including various data sets
As is seen from Table 2.1, the upper limits put on Mν from cosmology vary a lot. This is partly because cosmological data sets have improved significantly over the last years, partly due to differences in the data sets applied in the different analyses, and partly due to different priors on the underlying cosmological model. These priors concern both how many free parameters are included and over which range they are allowed to vary.
In this section I will focus on how the addition of extra data sets affects the neutrino mass limits in a standard flat ΛMDM model, mainly based on my own runs with CosmoMC. In the next section I will add an extra free parameter by allowing for $w_X \neq -1$.
In the analysis referred to in this section I have used an 8-parameter cosmological model with (ωb, ωCDM, θ, τ, Mν, ns, As, r) as free parameters. $\omega_b = \Omega_b h^2$ and $\omega_{CDM} = \Omega_{CDM} h^2$ account for the abundance of baryons and dark matter, respectively. ΩΛ is then given by the flatness assumption. θ is the ratio of the sound horizon to the angular diameter distance (effectively the same quantity as introduced in (5.17)). τ is the optical depth to the last scattering surface, accounting for
the fact that not all photons decoupled simultaneously (see e.g. [40]). Mν is the
good, old sum of the neutrino masses. ns is the scalar spectral index, as defined
in (3.32). As gives the amplitude of the primordial scalar fluctuations. r is the
primordial ratio of tensor to scalar fluctuations. This parameter is mainly leaving
a small imprint in the low multipoles of the CMB power spectrum, and setting this
to zero would probably not alter the limits on Mν significantly.
As already discussed, Mν affects the CMB and LSS power spectra directly,
and the LSS power spectrum is especially sensitive to the neutrino mass. Other
observables like the Hubble parameter and the redshift-luminosity relation for Sn1a
observations will not be directly affected by the neutrino mass to a large extent.
However, with 8 free parameters in our cosmological model, these extra data sets
will prove to be important for constraining other parameters that have degenerate
effects with neutrinos in the CMB and LSS power spectra.
As an example we see from the CMB power spectra in Figure 5.3 that an increased Mν will tend to make the first acoustic peak higher. This effect can be
compensated by adding more dark matter and correspondingly reducing ΩΛ. This
would in turn affect the expansion history of the universe and shift the peaks to the
left, which again could be compensated by enforcing a smaller h. These effects are
shown for the CMB power spectrum in Figure 5.6. So the effect of increasing Mν
can to some extent be camouflaged if one has no constraints on h. Constraints on
h can be given e.g. by data from the HST key project or by Sn1a observations. The
importance of priors on h was pointed out in [62], where they provide good fits to a
Λ-less universe with large Ων with only CMB and LSS data and h < 0.5. The data
quality of CMB and LSS experiments has improved since then, and as I will show
in my analysis, the situation is not that extreme anymore, although priors on h are still
important to put good constraints on Mν.
[Figure 5.6: CMB power spectra, l(l+1)Cl/(2π) (µK²) versus multipole moment l, for three models: ωCDM = 0.1, h = 68.7; ωCDM = 0.2, h = 68.7; ωCDM = 0.2, h = 40.]
Figure 5.6: Increasing ωCDM (or decreasing ωΛ ) will lower the acoustic peaks and
shift them to the left. The horizontal shift can be compensated by decreasing h.
The power spectra are produced using CAMB.
First I will go back to the analysis done with CMB data only. As quoted in
section 5.1, the analysis with the CMB data from WMAP3, ACBAR, CBI and VSA
gave an upper limit of Mν < 2.0eV. In Figure 5.7 I show the corresponding 68%
and 95% confidence contours in the Mν-Ωm and Mν-h planes, where the other
parameters have been marginalized over.
Obviously a similar plot in the Mν-ΩΛ plane would look like the Ωm plot with
the contours pointing down to the right instead of up. It is clear from this figure
that a tight upper bound on Ωm or a tight lower bound on h would
improve the Mν limit significantly. It is also interesting to notice that a confirmed
low value of h < 65 would tend to provide a cosmological lower limit on the
neutrino mass as well. I have also tried to add data from the Hubble Space Telescope key
project (HST) [63] (footnote 3) to see how this will improve the limits on Mν. From just this
additional prior on h the 95% C.L. upper limit on Mν improves from Mν < 2.0eV
to Mν < 1.7eV.
Footnote 3: Based on Cepheid observations, they constrained h to 72 ± 8.
[Figure 5.7: two panels created using CosmoloGUI, showing Ωm versus Mν and h versus Mν.]
Figure 5.7: 68% and 95% confidence contours in the Mν-Ωm and Mν-h planes from
the CosmoMC runs with only CMB data (solid lines). The dotted contours show
the new confidence limits from adding a prior on h from HST. The other parameters
are marginalized over.
Next I have explored the mass constraints when adding LSS data from SDSS
and 2dF to the CMB data sets (without the HST prior on h). As previously seen,
perhaps the most distinct impact of massive neutrinos on cosmological observables appears in the LSS power spectrum through the suppression of small-scale
fluctuations due to free-streaming. With these CosmoMC runs, the new 95% C.L.
upper limit on the neutrino mass becomes Mν < 0.82eV, that is, an improvement
by a factor of more than 2. The new confidence contours in the Mν-h plane including these two LSS data sets are shown in Figure 5.8. Although this addition of LSS
data improved the Mν limits significantly, the limit is still around twice as large as
the “standard” upper limits from cosmology of ∼ 0.4eV. This happens because we
still have too poor constraints on parameters like Ωm and h. So we need more data.
[Figure 5.8: created using CosmoloGUI, showing h versus Mν.]
Figure 5.8: 68% and 95% confidence contours in the Mν-h plane. The dashed contours
come from CMB and LSS data only. The solid contours include in addition HST,
Sn1a, Ly-α and BBN data. As a reference the contours from CMB data only are
shown (dotted lines).
To push the limits further I added data from HST, Sn1a (Riess gold sample),
Ly-α and a Big Bang nucleosynthesis (BBN) constraint of ωb = 0.020 ± 0.002 (footnote 4).
The Ly-α forest allows us to probe the matter power spectrum up to redshifts
of z ∼ 4. At such redshifts the non-linear effects on the perturbations kick in
on much smaller scales, which allows us to use smaller scales than when we are
comparing data and linear simulations at z ≈ 0. As neutrino mass effects are
more pronounced at small scales, Ly-α data are very tempting to use to constrain
neutrino masses. However there are still problems concerning the estimation of
systematic uncertainties related to the Ly-α data, so they should be treated with
some caution. The Sn1a data constrain Ωm since a certain value of ΩΛ is required
to explain the accelerated expansion indicated by the Sn1a observations. With the
inclusion of these new data and constraints, the neutrino mass limit reduces to
Mν < 0.47eV. Confidence contours in the Mν-h plane with these data are shown in
Figure 5.8. We see that the contours are significantly improved by the inclusion of
these new data sets. In the case of the Mν-h degeneracy this is to a large extent
caused by the higher preferred value for h from the HST measurement than from
the combined CMB and LSS data. And as seen in Figure 5.8, a higher value of
h gives less room for a large Mν. In Figure 5.9 I show the probability distribution
for Mν and the confidence contours in the Mν-Ωm plane for these data. We see
that the degeneracy with Ωm has become less severe than in the case where I only
used CMB data, since Ωm is further constrained from above by e.g. the h and Sn1a
constraints. It also turns out that when using the Sn1a data in addition to the CMB
and LSS data, the HST constraint on h becomes less important. For instance, if
one uses the same data as above but without the HST constraint, the upper limit on
the neutrino mass will only increase to Mν < 0.51eV.
Footnote 4: This analysis is based on knowledge of the deuterium abundance today. See [64].
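The 95% C.L. upper limits quoted in this section are one-sided limits read off from weighted Markov chain samples. The sketch below shows one simple way such a limit can be computed, assuming each sample carries a weight. The function name and the toy numbers are hypothetical, and this is not the actual getdist implementation.

```python
def weighted_upper_limit(values, weights, level=0.95):
    """Return the smallest sample value x such that at least a fraction
    `level` of the total weight lies at values <= x."""
    if len(values) != len(weights) or len(values) == 0:
        raise ValueError("values and weights must be non-empty and equally long")
    # Sort the samples by parameter value, keeping each weight with its sample.
    pairs = sorted(zip(values, weights))
    total = sum(weight for _, weight in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= level * total:
            return value
    return pairs[-1][0]


# Toy chain (hypothetical numbers): M_nu samples in eV with their weights.
toy_m_nu = [0.3, 0.1, 0.6, 0.2]
toy_weights = [2.0, 4.0, 1.0, 3.0]
print("95% upper limit:", weighted_upper_limit(toy_m_nu, toy_weights), "eV")
```

In a real analysis the chain would contain many thousands of samples, so that the limit does not jump between a handful of discrete values as it does in this toy case.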
[Figure 5.9: two panels created using CosmoloGUI, showing probability versus Mν and Ωm versus Mν.]
Figure 5.9: Results from the analysis with data from CMB, LSS, Sn1a, HST and
BBN measurements. The upper panel shows the probability distribution of Mν
with the 68% and 95% confidence limits given as vertical bars. The lower panel
shows the 68% and 95% confidence contours in the Mν -Ωm plane.
In the recent reference [34], where they provide the spectacular mass limit
Mν < 0.17eV, the improved limit is mainly due to putting stronger priors
on the Ly-α data than what has been done in previous analyses. Here they have used
priors on quasar spectra to put stronger constraints on the amplitude of the matter
spectrum from the Ly-α data than what has been done earlier. A discussion on
whether this is well justified or not is outside the scope of this text. But as they
show in the paper, this technique results in what looks like a slight inconsistency
between constraints on the power amplitude from WMAP3 and Ly-α (at the ∼ 2σ
level), and it is timely to discuss whether using these two data sets simultaneously
can be justified.
Another additional constraint is to use the baryonic acoustic oscillations (BAO)
that are detected in the power spectrum of luminous red galaxies in the SDSS data.
Comparing the LSS baryonic “bump” with the baryonic peaks in the CMB power
spectrum provides us with an additional cosmological ruler for relating positions
in angular and redshift space to physical distances. The main effect of using BAO
in constraining Mν is that it tightens the allowed range of ωm . BAO was first used
for constraining the neutrino mass in reference [33].
5.4 Dark energy with wX ≠ −1
In the last section all the analyses were done in the framework of a standard 8-parameter
cosmological model, and I only focused on the effects of applying different data sets. Now I will add an extra free parameter to my cosmological model,
namely the equation of state parameter for dark energy, wX.
The cosmological neutrino mass limits shown in table 2.1 are all derived assuming that dark energy obeys a cosmological constant equation of state, wX =
−1. The reason why most authors assume wX = −1 is mainly due to the facts that
• wX = −1 is theoretically tremendously beautiful as it corresponds to a cosmological constant. Especially models with wX < −1 (phantom energy)
are theoretically/philosophically unappealing as the increasingly rapid expansion within a finite time will make all the matter in the universe rip apart
in “the big rip” [65].
• Cosmological observations tend to favor a dark energy equation of state corresponding to wX ≈ −1.
• wX = −1 is extremely easy to incorporate in calculations. For example,
a cosmological constant will not cluster, and when doing perturbation theory one
only needs to account for it in the zero-order background evolution.
None of these reasons should be convincing enough to exclude the possibility that
wX ≠ −1. Current observational constraints on wX are for instance given in
[39] as −1.001 < wX < −0.875 (with WMAP3, LSS and Sn1a). However,
as they point out, these constraints depend on the data sets used and on the assumptions on dark energy clustering. In the derivation of this limit they also assumed vanishing neutrino masses. Including massive neutrinos, the limits change
to −1.16 < wX < −0.93. Both these limits were found assuming that wX is
constant in time.
The first paper to examine the degeneracy between wX and Mν was [35]. Here
one allowed for a cosmological model with both wX ≠ −1 (but constant in time)
and massive neutrinos. It was found that the upper limit on Mν increased from
0.65eV to 1.48eV at 95% C.L. (using WMAP1, SDSS, Sn1a and HST). The increased upper limit on neutrino mass corresponds to allowing for a wX below −1,
i.e. that the dark energy is in the phantom regime.
The physical explanation for the degeneracy is related to the Mν-Ωm degeneracy in the CMB and LSS power spectra. In the case of CMB, increasing Mν will
tend to enhance the peaks in the CMB power spectrum. This can be compensated
by increasing ΩCDM. This increment in ΩCDM will in turn imply a smaller value
for ΩΛ. With wX = −1 and such a small ΩΛ, the model quickly becomes incompatible with Sn1a data, which require a certain size of ΩΛ to explain the accelerated
expansion rate. However, if one allows for wX < −1, this acceleration can be
accommodated with a smaller dark energy density fraction ΩDE.
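The trade-off between wX and the dark energy fraction can be made concrete with the present-day deceleration parameter. The sketch below assumes a flat universe containing only matter and dark energy with constant wX, for which q0 = Ωm/2 + (1 + 3wX)ΩDE/2. The function names and numbers are illustrative. Note also that Sn1a data constrain the full luminosity-distance relation and not only q0, so this is a crude estimate of the effect.

```python
def deceleration_parameter(omega_m, w_x):
    """q0 for a flat universe with matter and dark energy of constant w_x
    (radiation neglected); q0 < 0 means accelerated expansion."""
    omega_de = 1.0 - omega_m
    return 0.5 * omega_m + 0.5 * (1.0 + 3.0 * w_x) * omega_de


def w_for_same_acceleration(omega_m, q0_target):
    """Constant w_x that reproduces q0_target for a given omega_m (flat universe)."""
    omega_de = 1.0 - omega_m
    return ((2.0 * q0_target - omega_m) / omega_de - 1.0) / 3.0


# Reference model: Omega_m = 0.3 with a cosmological constant.
q0_reference = deceleration_parameter(0.3, -1.0)
# A model with more matter needs w_x < -1 to accelerate equally fast today.
w_needed = w_for_same_acceleration(0.4, q0_reference)
print(f"q0 = {q0_reference:.2f}; with Omega_m = 0.4 this requires w_x = {w_needed:.3f}")
```

In this toy estimate, raising Ωm from 0.3 to 0.4 while keeping the same present-day acceleration pushes wX into the phantom regime, in line with the qualitative argument above.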
Trying to reproduce the results from [35] I analyzed the same parameter and
data sets as in [35]. In contrast to their result of Mν < 1.48eV, I found
Mν < 1.24eV. I am not certain about the reason for this discrepancy, but it might be
due to a difference in the handling of dark energy perturbations. The confidence
contours in the Mν-wX plane that I found are shown in Figure 5.10.
[Figure 5.10: created using CosmoloGUI, showing wX versus Mν.]
Figure 5.10: 68% and 95% confidence contours in the Mν -wX plane. The dotted
lines correspond to the intended reproduction of the results in [35] using WMAP1,
SDSS, HST and Sn1a. The solid contours result from including WMAP3, 2dF,
small scale CMB, BBN and a larger Sn1a sample (Riess gold sample) in the analysis.
Next I tried to include the 3-year WMAP data, small scale CMB data, 2dF
and the BBN constraint on ωb. This reduced the mass limit to Mν < 0.58eV.
Compared to the result I got with the same data sets and wX = −1, Mν < 0.57eV,
the degeneracy does not look severe at all anymore. The importance of adding
WMAP3 results is significant. With only WMAP1 data the limit increased from
0.58eV to 0.82eV. The importance of including WMAP3 results can be understood
by tighter constraints on wX in the 3-year data and that the best-fit value of wX has
been pushed up from wX ≈ −0.98 to wX ≈ −0.93 in their 3-year results. This
higher value for wX will favor a smaller Mν. The 95% limits on wX from the data
used in my analysis are −1.00 < wX < −0.70, which explains why the degeneracy
is disappearing. The confidence contours for this extended data set are given in
Figure 5.10 together with the more limited data set as used in [35].
In the recent paper [33] they pointed out that the use of BAO data to a large extent
resolves the possible Mν-wX degeneracy problem, since it adds constraints that are
almost orthogonal to the Sn1a constraints in the Ωm-wX plane. For an extensive
11-parameter cosmological model with a free number of relativistic species, Nν, a
free running of the scalar spectral index, αs, and free wX, they found that by adding
information on BAO the neutrino mass limit decreased from Mν < 2.3eV to Mν <
0.48eV. In addition to BAO they utilized data from CMB (first-year WMAP), LSS
and Sn1a. In a more constrained model with Nν = 3, αs = 0 and wX = −1, and
adding Ly-α data, the limit reduced to the frequently quoted Mν < 0.30eV.
In reference [66] they analyze the ability of future CMB experiments to extract
information on the gravitational lensing potential. Here they also claim that such
information will lift much of the Mν -wX degeneracy.
In [35], [33] and in my analysis, only models with constant wX are considered.
In [67] they also allow for a time varying wX parametrized as
wX (a) = w0 + (1 − a)w1 ,
where a is the scale factor. They find that Mν is not strongly correlated to the
time dependency of wX (that is, w1 ), so a first order extension of the assumed
constancy of wX will not lead to severe additional problems for the cosmological
neutrino mass limits.
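For reference, the parametrization wX(a) = w0 + (1 − a)w1 has a closed-form dark energy density evolution, obtained by integrating the continuity equation: ρX(a)/ρX(1) = a^(−3(1 + w0 + w1)) exp(−3w1(1 − a)). The sketch below evaluates both expressions. The function names and the example parameter values are illustrative and are not taken from [67].

```python
import math


def w_x(a, w0, w1):
    """Equation of state w_X(a) = w0 + (1 - a) * w1, with a = 1 today."""
    return w0 + (1.0 - a) * w1


def dark_energy_density_ratio(a, w0, w1):
    """rho_X(a) / rho_X(a = 1) for the linear-in-a parametrization above."""
    return a ** (-3.0 * (1.0 + w0 + w1)) * math.exp(-3.0 * w1 * (1.0 - a))


# Illustrative values: a cosmological constant and a mildly evolving model,
# both evaluated at a = 0.5 (redshift z = 1).
for w0, w1 in [(-1.0, 0.0), (-0.9, 0.3)]:
    ratio = dark_energy_density_ratio(0.5, w0, w1)
    print(f"w0 = {w0}, w1 = {w1}: w_X(0.5) = {w_x(0.5, w0, w1):.2f}, "
          f"rho_X(0.5)/rho_X(1) = {ratio:.3f}")
```

For w0 = −1 and w1 = 0 the density ratio stays at 1 for all a, recovering the cosmological constant.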
Concluding this section one can say that the problem with degeneracy between
Mν and wX that was presented in [35] is to a large extent resolved. This is mainly
due to the inclusion of BAO data, but more accurate data sets like WMAP3, which prefer larger values of wX than before, also make the picture brighter for the
robustness of the cosmological neutrino mass limits.
5.5 The relation between the 0νββ result and cosmological mass limits
As mentioned in section 2.4.2 there is a claim for a detection of the effective
electron neutrino mass ⟨mνe⟩ = (0.1 − 0.9)eV (99.7% C.L.) by the Heidelberg-Moscow (HM) neutrinoless double β-decay experiment. In the following section
2.4.3 I referred to cosmological limits on the sum of the neutrino masses down to
Mν < 0.17eV. It is interesting to see how this cosmological mass observable is
connected to the 0νββ mass observable and then see if they are compatible with
each other.
Using (2.18) and (2.17) we have
$$\langle m_{\nu_e} \rangle = \sum_i U_{ei}^2 m_i = \cos^2\theta_{13}\cos^2\theta_{12}\, m_1 + \cos^2\theta_{13}\sin^2\theta_{12}\, m_2 + \sin^2\theta_{13}\, m_3 \qquad (5.35)$$
where the CP-phase in (2.17) has been omitted. From the small mass splittings
inferred from neutrino oscillation experiments, we know that in the range of cosmological neutrino mass limits, the mass eigenstates are close to degenerate. To
get an estimate of the scales involved, we now assume total mass degeneracy
(m1 = m2 = m3 ≡ m). Inserted in (5.35) this gives
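$$\langle m_{\nu_e} \rangle \approx \cos^2\theta_{13}\cos^2\theta_{12}\, m + \cos^2\theta_{13}\sin^2\theta_{12}\, m + \sin^2\theta_{13}\, m = m \left[ \cos^2\theta_{13} \left( \cos^2\theta_{12} + \sin^2\theta_{12} \right) + \sin^2\theta_{13} \right] = m \qquad (5.36)$$
which is no big surprise given the degeneracy assumption. To a reasonable approximation we then may say that
$$M_\nu \approx 3 \langle m_{\nu_e} \rangle \qquad (5.37)$$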
So it does not look like the HM result and the cosmological mass limits are compatible: with (5.37) the HM range corresponds to Mν ≈ 0.3 − 2.7eV, well above the cosmological limit of Mν < 0.17eV. The detailed relation between Mν and ⟨mνe⟩ depends on the exact value
of the involved mixing angles, θ12 and θ13, the mass differences ∆m²12 and ∆m²23,
and whether the mass hierarchy is normal or inverted. In Figure 5.11 Mν is
shown for variation of some of the other parameters. It is clear that these variations
do not rescue us from the inconsistency, and that by far the most important uncertainty when it comes to predicting Mν from ⟨mνe⟩ lies in the uncertainty of the HM
result and the corresponding uncertainty in the nuclear matrix elements involved.
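The kind of calculation behind Figure 5.11 can be sketched as follows: for a given effective mass, solve (5.35) for the lightest mass eigenstate with the mass splittings fixed, and then sum the three masses. The values of ∆m²23 and sin²θ12 below are the best-fit values quoted in the caption of Figure 5.11, with θ13 = 0. The solar splitting of 8 × 10⁻⁵ eV² is an assumed typical value that is not quoted in this section, and the function names and the bisection solver are illustrative choices rather than the method used to produce the figure.

```python
import math

DM2_21 = 8.0e-5    # eV^2, assumed solar splitting (not quoted in this section)
DM2_32 = 2.4e-3    # eV^2, |m3^2 - m2^2| from the caption of Figure 5.11
SIN2_T12 = 0.282   # from the caption of Figure 5.11
SIN2_T13 = 0.0     # theta_13 set to zero, as in Figure 5.11


def masses(m_lightest, hierarchy):
    """Return (m1, m2, m3) in eV for a given lightest mass and hierarchy."""
    if hierarchy == "normal":      # m1 < m2 < m3
        m1 = m_lightest
        m2 = math.sqrt(m1**2 + DM2_21)
        m3 = math.sqrt(m2**2 + DM2_32)
    elif hierarchy == "inverted":  # m3 < m1 < m2
        m3 = m_lightest
        m2 = math.sqrt(m3**2 + DM2_32)
        m1 = math.sqrt(m2**2 - DM2_21)
    else:
        raise ValueError("hierarchy must be 'normal' or 'inverted'")
    return m1, m2, m3


def effective_mass(m1, m2, m3):
    """Effective electron neutrino mass from (5.35), CP phases omitted."""
    cos2_t13 = 1.0 - SIN2_T13
    return (cos2_t13 * (1.0 - SIN2_T12) * m1
            + cos2_t13 * SIN2_T12 * m2
            + SIN2_T13 * m3)


def total_mass(m_ee, hierarchy):
    """Sum of the neutrino masses that reproduces the effective mass m_ee."""
    low, high = 0.0, 10.0
    if effective_mass(*masses(low, hierarchy)) > m_ee:
        raise ValueError("m_ee is below the minimum allowed by the splittings")
    # The effective mass grows monotonically with the lightest mass,
    # so bisection on the lightest mass is sufficient.
    for _ in range(100):
        mid = 0.5 * (low + high)
        if effective_mass(*masses(mid, hierarchy)) < m_ee:
            low = mid
        else:
            high = mid
    return sum(masses(0.5 * (low + high), hierarchy))


m_normal = total_mass(0.36, "normal")
m_inverted = total_mass(0.36, "inverted")
print(f"normal: M_nu = {m_normal:.4f} eV, inverted: M_nu = {m_inverted:.4f} eV")
```

Under these assumptions both hierarchies give a sum within about half a percent of 3⟨mνe⟩ = 1.08eV for ⟨mνe⟩ = 0.36eV, with the normal hierarchy slightly above and the inverted hierarchy slightly below the degenerate estimate (5.37).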
As already mentioned, the HM result is still very controversial, but it will be
checked by new and more sensitive experiments in a few years, and if it turns out to
be confirmed, maybe even with a better accuracy, it will certainly have implications
for cosmology.
In the following I will assume that the HM detection is correct and see how this
affects cosmological parameters. I will assume that their limits on ⟨mνe⟩ derive
from a Gaussian distribution. I will further assume that the mass eigenstates are
degenerate, such that Mν = 3⟨mνe⟩ = 0.3 − 2.7eV at 99.7% C.L. I implement this
Gaussian prior on Mν in the getdist analysis tool supplied with CosmoMC, and
compare the analysis of my CosmoMC runs with and without this HM prior on Mν.
The comparison concerns both the change of preferred parameter values, especially
for Mν, and whether the sets of cosmological data used are consistent with the
HM results. Obviously, since the 3σ region from HM and the 2σ cosmological
limit from [33] do not overlap, using the full range of cosmological data sets is not
consistent with HM. So it is interesting to see how much cosmological data one
can include without drastically increasing the − log(L) of the model, L being the
likelihood function (see Appendix C).
[Figure 5.11: four panels showing Mν (eV) versus ⟨mνe⟩ (eV), ∆m²23 (10⁻³ eV²), sin²θ12 and sin²θ13, each for the normal and the inverted hierarchy.]
Figure 5.11: Here Σi mνi = Mν is shown for variation in ⟨mνe⟩, ∆m²23, sin²θ12
and sin²θ13. ⟨mνe⟩ is varied over its 99.73% C.L. interval, while the other
three parameters are varied over their 90% C.L. intervals. When one parameter
is varied the other ones are held fixed at their best-fit values ⟨mνe⟩ = 0.36eV,
∆m²23 = 2.4 × 10⁻³ eV², sin²θ12 = 0.282, and θ13 is set to 0. The upper solid
lines represent a normal hierarchical mass scheme, while the lower dashed lines
represent an inverted hierarchy.
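Applying a Gaussian prior to an existing chain amounts to importance reweighting: each sample weight is multiplied by the prior density evaluated at that sample's Mν. The sketch below illustrates the idea, reading the HM range 0.3 − 2.7eV as mean ± 3σ, i.e. mean 1.5eV and σ = 0.4eV. The function names and the toy samples are hypothetical, and this is not the getdist code itself.

```python
import math

HM_MEAN = 1.5   # eV, midpoint of the 0.3 - 2.7 eV range
HM_SIGMA = 0.4  # eV, assuming the 99.7% range corresponds to +/- 3 sigma


def apply_gaussian_prior(m_nu_samples, weights, mean=HM_MEAN, sigma=HM_SIGMA):
    """Return new sample weights after multiplying by an unnormalised
    Gaussian prior in M_nu."""
    return [weight * math.exp(-0.5 * ((m_nu - mean) / sigma) ** 2)
            for m_nu, weight in zip(m_nu_samples, weights)]


def weighted_mean(values, weights):
    """Weighted average of the sample values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Toy chain (hypothetical numbers): M_nu in eV with equal weights.
toy_m_nu = [0.1, 0.5, 0.9, 1.5]
toy_weights = [1.0, 1.0, 1.0, 1.0]
reweighted = apply_gaussian_prior(toy_m_nu, toy_weights)
print("mean M_nu without prior:", weighted_mean(toy_m_nu, toy_weights))
print("mean M_nu with HM prior:", round(weighted_mean(toy_m_nu, reweighted), 3))
```

Reweighting of this kind is only reliable when the original chain has a reasonable number of samples in the region favoured by the prior. If the cosmological data push nearly all samples to small Mν, the reweighted chain is dominated by a few samples.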
I have analyzed one cosmological model with wX = −1 and one with wX as
a free parameter. For each model I have applied four different combinations of
data sets defined in Table 5.2. The results are listed in Table 5.3 for the case of
wX = −1 and in Table 5.4 for the case of a free wX .
Notice that the difference in − log L when applying the HM prior is negligible
with data sets 1 and 2, where only CMB data are used. But when adding cosmological data on LSS the difference increases significantly, and adding HST, BBN and
Sn1a data the difference becomes even larger. It also turns out that adding wX as
a free parameter does not make the cosmological model flexible enough to give
consistency with the HM result (footnote 5).
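To get a feeling for the size of these differences, one can translate them into an effective ∆χ² using χ² = −2 ln L, which holds up to a constant for Gaussian likelihoods. The sketch below does this for the − log L values listed in Table 5.3 (wX = −1), assuming that log denotes the natural logarithm. The function name is illustrative.

```python
# (data set, -log L without HM prior, -log L with HM prior), from Table 5.3.
TABLE_5_3 = [
    (1, 5625.9, 5626.2),
    (2, 5636.9, 5637.1),
    (3, 5668.7, 5671.7),
    (4, 5695.7, 5700.0),
]


def delta_chi2(neg_log_l_without, neg_log_l_with):
    """Effective chi-square penalty from adding the prior: 2 * Delta(-ln L)."""
    return 2.0 * (neg_log_l_with - neg_log_l_without)


for data_set, without_prior, with_prior in TABLE_5_3:
    penalty = delta_chi2(without_prior, with_prior)
    print(f"data set {data_set}: delta chi^2 = {penalty:.1f}")
```

The penalty stays below 1 for the CMB-only data sets and rises to 6.0 and 8.6 once LSS data and then HST, BBN and Sn1a data are included.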
Data set   Data sets used
1          WMAP3
2          WMAP3, VSA, ACBAR, CBI
3          WMAP3, VSA, ACBAR, CBI, 2dF, SDSS
4          WMAP3, VSA, ACBAR, CBI, 2dF, SDSS, HST, BBN, Sn1a
Table 5.2: The four different combinations of data sets used in the analysis of cosmological data with a HM prior.
Data set   HM prior?   Mν (eV)        − log L   ∆ log L
1          no          0 − 1.93       5625.9
1          yes         0.79 − 1.93    5626.2    0.3
2          no          0 − 1.99       5636.9
2          yes         0.82 − 1.90    5637.1    0.2
3          no          0 − 0.82       5668.7
3          yes         0.33 − 1.29    5671.7    3.0
4          no          0 − 0.57       5695.7
4          yes         0.18 − 0.84    5700.0    4.3
Table 5.3: 95% confidence limits on Mν with the different combinations of data
sets listed in Table 5.2 in a cosmological model with wX = −1. The resulting
limits on Mν together with the likelihood for the best-fit sample are given both
when applying the HM prior and when only using cosmological data. We see that
the difference in likelihood becomes significant as soon as LSS data are taken into
account.
Data set   HM prior?   Mν (eV)        − log L   ∆ log L
1          no          0 − 2.1        5625.9
1          yes         0.83 − 1.88    5626.1    0.2
2          no          0 − 2.1        5636.9
2          yes         0.86 − 1.92    5637.0    0.1
3          no          0 − 0.79       5665.7
3          yes         0.30 − 1.14    5668.8    3.1
4          no          0 − 0.58       5695.6
4          yes         0.17 − 0.92    5700.0    4.4
Table 5.4: Corresponding to Table 5.3, except that I here have used a cosmological
model where wX is taken as a free variable. Compared to the model with wX =
−1, adding this extra degree of freedom does not help significantly in making
cosmology and the HM result more compatible when using cosmological LSS data.
Footnote 5: There is a very interesting feature of these data that is worthy of a small comment. It appears that using
data set number 3, the limit on Mν is slightly tighter in the case of a free wX. This is counterintuitive. One would not expect that adding more free parameters would tighten limits on the other
parameters. The reason for this is that with this combination of data sets the 95% C.L. on wX
becomes −0.92 < wX < −0.40, which actually means that data set 3 rules out a cosmological
constant at the 2σ level. Using the full data set 4, wX = −1 reenters the 95% C.L. This hints that
there might be some inconsistencies in the data already before the HM prior is taken into account.
Although this looks really exciting, I will leave this problem here for the moment.
This is not surprising. The 99.7% C.L. on the HM prior of 0.3eV < Mν <
2.7eV corresponds to a 95% C.L. of 0.7eV < Mν < 2.3eV. If the data were
consistent one would of course expect a significant overlap of the 95% regions,
which is not the case as soon as the cosmological upper limit becomes smaller than
∼ 1eV, which happens when one adds LSS data.
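The conversion between the two confidence levels, and the resulting overlap with the cosmological limits, can be checked with a few lines. The sketch assumes a Gaussian distribution, with 99.7% corresponding to ±3σ and 95% to ±1.96σ. The cosmological 95% intervals are the no-prior entries of Table 5.3 for data sets 2, 3 and 4, and the function names are illustrative.

```python
def rescale_interval(low, high, z_from=3.0, z_to=1.96):
    """Convert a symmetric Gaussian interval from z_from sigma to z_to sigma."""
    mean = 0.5 * (low + high)
    sigma = (high - low) / (2.0 * z_from)
    return mean - z_to * sigma, mean + z_to * sigma


def overlap(interval_a, interval_b):
    """Length of the overlap between two intervals (0 if they are disjoint)."""
    lower = max(interval_a[0], interval_b[0])
    upper = min(interval_a[1], interval_b[1])
    return max(0.0, upper - lower)


hm_95 = rescale_interval(0.3, 2.7)  # HM prior on M_nu in eV
# 95% intervals without HM prior from Table 5.3, keyed by data set number.
cosmology_95 = {2: (0.0, 1.99), 3: (0.0, 0.82), 4: (0.0, 0.57)}

print(f"HM 95% interval: {hm_95[0]:.2f} - {hm_95[1]:.2f} eV")
for data_set, interval in cosmology_95.items():
    print(f"data set {data_set}: overlap = {overlap(hm_95, interval):.2f} eV")
```

The overlap shrinks from more than 1eV with CMB data only to about 0.1eV when LSS data are added, and it vanishes for the full data set 4.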
Since the HM result is not commonly accepted as a detection of ⟨mνe⟩ yet, we
can as cosmologists sweep the problem under the rug and say that cosmology has
ruled out the HM result. So what then, if the HM result is confirmed by another
experiment like GERDA? We could still get away with it by saying that cosmology
has ruled out the 0νββ matrix elements. But if we get yet another confirmation by a
totally different experiment like KATRIN, which measures tritium β-decay, and
they detect, say, Mν ≈ 1.2eV, then this would have implications for cosmology.
At this time we will probably have available data from the Planck-satellite,
which will have different systematic errors than WMAP. So if Planck confirms and
strengthens the results from WMAP, it will be difficult to discard the CMB data.
When it comes to LSS measurements there is little doubt that the galaxies that
we count are where they are, so even if better LSS surveys can improve the current
data, it is very unlikely that they will contradict them. Loosening the constraint that
the bias parameter should be approximately constant on the linear scales may help,
but that would in turn contradict the simulations that have predicted the constancy
of b, and a b that is far from constant on the relevant scales seems hard to fit into the
current cosmological model. So it seems difficult to fit in a large Mν by blaming
observational systematics.
The next possibility is to add more free parameters. As shown above, wX is
not a good solution. And as was shown in [67], adding a simple time variation to
wX is unlikely to loosen the limits on Mν much further. One could assume that the
primordial power spectrum is not described by a simple power law in k. To first
order this can be accounted for by introducing a running of ns, αs. However, in
[33] it was shown that Mν is not very sensitive to this parameter either. The same
was shown to be the case with the effective number of relativistic particles, Nν . An
exotic option is the possibility of mass varying neutrinos (MaVaNs) coupled to a
scalar field which makes their mass dependent on the local energy density in the
universe. Such a model was introduced in [68]. Due to its non-standard features,
this model is hard to accept, but it serves as an example of how one always can
come up with new parameters that can resolve cosmological problems and even
give some clues to what to expect from a fundamental particle theory.
The easiest way out would be to just add the detected neutrino mass as an input
parameter in our cosmology and pretend that nothing has happened (footnote 6). If there is no simple way
to incorporate the large neutrino mass into cosmology without adding some really
nasty-looking extra parameters, Bayesian selection methods would probably still
find a simple ΛCDM model as the best one because of its small number of free
parameters and the relatively small loss in likelihood by adding the heavy neutrinos.
Or one could use the new information on the neutrino mass as an important input
for a revision of the current cosmological limit, which could be a very interesting
process.
Footnote 6: This would reduce the likelihood of the model, but the cosmological best-fit parameters would
not change a lot. From the runs with data set 4 I find that an introduction of a HM prior will, for
reasons explained earlier, make Ωm increase slightly. ΩΛ would decrease correspondingly, and h
would also decrease slightly. The rest of the parameters would remain almost unchanged.
But the HM result is still not confirmed, and perhaps it never will be. Both cosmology and particle experiments are approaching a detection of the neutrino mass, and
it is likely that some day in the not too far future both particle experiments and
cosmology will come up with a neutrino mass detection. A good correspondence
between the results would in that case be a tremendous success both for particle
physics and for cosmology, and it would be a good indication that we are on the right
track with our cosmological models.
Chapter 6
Summary and outlook
I have studied several aspects of the role massive neutrinos are playing in cosmology.
First I reviewed the experimental evidence and constraints on neutrino masses.
From particle experiments we have good constraints on the mass differences between
the different neutrino mass eigenstates, but only weak limits on the absolute mass
scale. Cosmology is at leading order sensitive to the sum of the absolute masses of
the neutrinos, Mν , and is therefore an excellent additional probe for the nature of
neutrino masses.
Although neutrinos probably make up less than 1% of the total energy density
in the universe today, they might be “seen” by cosmological observations because
of their peculiar imprint related to their small mass. The most pronounced effect of
massive neutrinos is found in the matter power spectrum. Because of their small
masses and large velocities, neutrinos will free-stream and suppress structure growth
on small scales. The amount of suppression increases proportionally with the neutrino mass, while the scale where the suppression begins is reduced with larger
neutrino mass. Thus, by the absence of such small scale suppression in the observed matter power spectrum, one can infer an upper limit on Mν. Deducing such
a limit requires good knowledge of what the power spectrum would look like in the
absence of massive neutrinos, so it is important to have tight constraints on other
cosmological parameters like Ωm and h. Therefore the inclusion of more than just
LSS data sets is crucial to put good limits on Mν.
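The proportionality between the suppression and the neutrino mass can be illustrated with the widely used linear-theory estimate ∆P/P ≈ −8fν, where fν = Ων/Ωm is the neutrino fraction of the matter density. The sketch below assumes Ωνh² = Mν/(93.14 eV) and an illustrative value Ωmh² = 0.14. The estimate only applies for fν ≪ 1 and on scales well below the free-streaming scale, and the function names are hypothetical.

```python
def neutrino_fraction(m_nu_ev, omega_m_h2):
    """f_nu = Omega_nu / Omega_m, assuming Omega_nu h^2 = M_nu / (93.14 eV)."""
    return (m_nu_ev / 93.14) / omega_m_h2


def small_scale_suppression(m_nu_ev, omega_m_h2):
    """Approximate relative change of the small-scale linear matter power
    spectrum, Delta P / P ~ -8 f_nu (valid only for f_nu << 1)."""
    return -8.0 * neutrino_fraction(m_nu_ev, omega_m_h2)


# Illustrative matter density; the mass values are limits quoted in the text.
for m_nu in (0.17, 0.47, 0.82):
    change = small_scale_suppression(m_nu, omega_m_h2=0.14)
    print(f"M_nu = {m_nu:.2f} eV: Delta P / P = {change:.2f}")
```

Since the estimate is linear in Mν, halving the mass limit halves the allowed suppression, which is why accurate small-scale power spectrum data translate so directly into tighter limits.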
The extremely good quality of observations of the CMB power spectrum makes
it invaluable to constrain a lot of parameters, and I have not tried to place limits on
Mν without the use of CMB data. I have examined the effect of adding more data
sets to the analysis. It turned out that it was important to use more than just CMB
and LSS data to put tight limits on Mν . Sn1a observations and priors on h from
the Hubble Space Telescope proved useful due to their ability to constrain ΩΛ , Ωm
and h.
I also tried to include the equation of state for dark energy, wX, as a free parameter. It was shown in [35] that this should weaken the upper limits on Mν significantly. I found a degeneracy between wX and Mν, though it was slightly less
severe than claimed in [35], even when I applied the same data sets (i.e. only first
year WMAP data). Adding more data sets, like the 3-year WMAP data, the degeneracy became significantly less severe. This is probably related to tighter lower
bounds for wX found by the WMAP team in their 3-year data [39]. An analysis
using information on the baryonic acoustic peak in the LSS power spectrum also
finds that this extra information is breaking the Mν-wX degeneracy. So this
degeneracy is becoming less and less severe for cosmological neutrino mass limits.
Finally I studied the relation between the cosmological neutrino mass limits and the claim of a detection of the effective electron neutrino mass in the
Heidelberg-Moscow neutrinoless double β-decay experiment. I showed that this
result is hard to fit into our standard cosmology, even if we only are using CMB
and LSS data. Leaving wX as a free parameter did not resolve this problem. A
possible confirmation of the HM-result close to its best fit from future experiments
would therefore be an important input for a revision of our cosmological standard
model.
Further constraining of the neutrino mass from cosmological observations will
come “automatically” in the coming years through the improvement of CMB, LSS
and Sn1a data sets. But one should also look for new data to use and better ways
to exploit the existing data sets. The detection and use of BAO in [33] is a good
example of how new types of data can improve the limit on Mν by breaking degeneracies. But to continue improving the limits it seems to be important to probe
the matter power spectrum at higher redshifts to be able to study smaller scales
without having to deal with troublesome non-linearities. This can be done with
higher redshift LSS surveys, or by using the current and future data on the Ly-α
forest. The latter has already been done in [34], where they used knowledge of
quasar luminosities to constrain the amplitude of the Ly-α power spectrum and
found an upper limit of Mν < 0.17eV. But there are still uncertainties related to
the systematic errors of such a use of the Ly-α power spectrum, so one should be
careful in drawing too strong conclusions from this data.
From just a glance at Table 2.1 one can see an enormous improvement in the
cosmological neutrino mass limits over the last few years. And we are now at
a stage where we are approaching the lower mass limit from neutrino oscillation
experiments. It is not unlikely that we will detect the neutrino mass within
a few years. Doing this would be an important input of data to particle physics,
and it would also yield an impressive strengthening of the current cosmological
standard model, especially if the detection corresponds to an independent detection
from an earth-based particle experiment. A possible contradiction between particle
experiments and cosmology would on the other hand be an important input for a
possible revision of our cosmological model. In any case, neutrino cosmology will
undoubtedly be an active, evolving and exciting field in the coming years.
I will end with a figure (Figure 6.1) showing the evolution of the cosmological
upper limits on Mν from the past few years, where some empty space is left on the
right hand side for free extrapolation.
[Figure 6.1 (plot): Mν on the vertical axis (0 to 2.5) against Year on the horizontal axis (2003 to 2008). Legend: cosmological upper limit; lower limit from oscillation experiments. Labelled points: Elgarøy et al., Hannestad et al., Crotty et al., Seljak et al., Goobar et al.]
Figure 6.1: Some typical cosmological upper limits on Mν from the past few years.
Appendix A
Some comments on model dependency in cosmology
Every natural science has to deal with interpretation of data. This cannot be done
independently of a model or a theory. This is because one will never, not even in
principle, be able to make a really direct measurement of anything in nature. Every
measurement is to some extent indirect, and thus model dependent.
A.1 Model dependency and indirectness
If a scientist claims to know a quantity, she is strictly speaking only referring to a
perception of what that quantity is inside her own head.
One may for example claim to know that the number displayed on a computer screen is “42”, but then one is of course only interpreting the electromagnetic
waves coming from the screen and the way those waves are refracted by the eye
and how they trigger nerve impulses to the brain. The point here is not to start
digging into the huge field of philosophy of human knowledge, but rather to stress
that every observation, no matter how trivial it may seem, carries with it at least a
small portion of indirectness.
If we now say that the number “42” represents, for example, the voltage over
an electric semiconductor measured by a voltmeter, the level of indirectness has
already increased enormously. The result “U = 42 V” now depends, for example, on the theory of electromagnetism used for designing the voltmeter. Still we
seldom find any reason to distrust our result, because the properties of a voltmeter
and the technology on which it is based have been tested thoroughly over the years
using the hypothetico-deductive method. If we now, for example, consider Ohm’s
law to be a part of our model that we do not really trust, we could say that “The
voltage is 42 V within the framework of Ohm’s law”, making some of our model
dependency more explicit.
It is clear that every step which makes a measurement more indirect will increase its model dependency.
A.2 Problems appearing in cosmology
The problems mentioned above will of course apply to every natural science, but
they will often be more severe in cosmology than in many other sciences, since
cosmology is really a cutting-edge physical science when it comes to applying
extremely indirect measurements. Thus the measurements are usually also very
model dependent.
For example, if we claim that the luminosity-redshift relation for supernovae of type
1a (Sn1a) proves the existence of dark energy, we are, among lots of other things,
assuming that
• Sn1a have a consistent relation between light curves and absolute luminosity,
which we have understood well by observation of nearby Sn1a.
• the laws of physics are the same at different redshifts.
• neutrino mass is a constant, and not dependent on the local mass
density¹.
• the laws of gravitation are the same on large scales as on the scales that have
been tested inside our solar system.
• we have homogeneity and isotropy on large scales (e.g. we are not living in
the middle of an underdense bubble, see for example [69]).
This list could have been made infinitely long. Of special interest is the fact that
we can never make a complete list of the “lots of other things” that we assume.
For example, hardly anybody was aware of assuming constant neutrino
masses before someone came up with the idea that they may not be constant after
all.
Of course cosmology is cursed with a huge model dependency. After all we
are working with things farther away and larger than anything else we know, and we
can never test our predictions by running the universe over again with different
initial conditions or by waiting some billion years to see what happens next. We
cannot stop doing cosmology for this reason. We just have to be aware of some of
the major assumptions made, and hopefully have some ideas of how sensitive our
results are to deviations from these assumptions.
Below I will comment upon some of the problems related to model dependency
in cosmology.
A.2.1 On the border of becoming an exact science
For many years the only concern in cosmology was to find the most likely cosmological model given the known data and physical theories. The focus was to find
a framework within which one could understand the most important properties of
the universe. After the recent years’ explosion in data from important cosmological
observables like CMB, LSS and Sn1a, there has been an increasing consensus that
the universe is likely to be something close to a flat ΛCDM universe, homogeneous and isotropic on large scales. Assuming that this model is correct, one can
then use the new data for pinning down physical quantities like the neutrino mass.
There is of course nothing wrong with this approach; it just amounts to saying that “assuming
that X is the correct cosmological model, then the sum of the neutrino masses has
to be smaller than Y”. If one wants to see how model dependent the mass limit is,
one can try to vary the different parameters in the model and see how this affects
the limit. However, since there is an infinite number of ways to alter the model, for
example by introducing new parameters, there is also an infinite number of models
that will fit our observational data, so one can never test and quantify the model dependency totally. Stating that some quantity is determined in a “model independent
way” is therefore meaningless unless one specifies within which class
of models the quantity is model independent.
¹ A neutrino mass depending on the local mass density could mimic a cosmological constant. See for example [68].
A.2.2 Feedback when trying to verify a model
So why are we saying that there is an increasing consensus that the universe is
something close to the flat ΛCDM model? This is because we claim that, among
others, the CMB, LSS and Sn1a observations are all consistent with the predictions
made by this model.
One problem with a reasoning like this is that every observation has to be interpreted within a model. So say we are having a sort of concordance model. Then
we have a set of observations that we translate to physical quantities applying this
model. In the end we use these deduced physical quantities to verify our model.
So we are using our assumptions to verify the same assumptions. Of course, there
is no other way to test models. It would be even worse to use different models for
deducing physical quantities and for testing against the quantities found.
How severe this problem is in practical cosmology depends on the model dependency of the observations used for the verification. Above it was argued
that it is impossible to quantify the total model dependency of a measured physical
quantity. This is not the place for arguing about whether human beings can know
anything about anything or not, but there is no doubt that because of the scientific
difficulties in cosmology (related to predictability and repeatability) we have to
be aware of this problem. Physically wrong cosmological models may be verified
within their own framework through such a feedback mechanism, even if the
same observables interpreted through the knowledge of the correct model would
have ruled them out.
A.2.3 Self-maintenance of popular models
The feedback-problem above was of a philosophical character. Another problem
of a more practical character is the problem of self-maintenance of cosmological
models.
It is a well-known and thoroughly discussed problem in science that a popular
concordance model commonly accepted in the scientific community is extremely
hard to change due to human resistance against paradigm shifts. In a science like
cosmology you also have another and more technical problem that will lead to the
same self-maintenance of popular models.
As already commented, cosmologically measured quantities are in general heavily model dependent. Since much of the objective of cosmology is to find the best
cosmological model, it is of great interest to test new models against data all the
time. The problem then is that since the former obtained physical quantities are
found within an old model, one cannot simply use those physical quantities to test
the new model. In general one should reinterpret all the cosmological data within
the new framework that is going to be tested, and then finally see if the model fits
the deduced physical quantities.
Since the universe is rather large and complex, this is an immense task unless
one is really convinced that one is on the right track. For example, one has to
write a new code à la CMBFAST for calculating CMB power spectra, since a code
like CMBFAST only allows for limited ways to alter the flat ΛCDM model. Since
more and more simulations and techniques are centered around the ΛCDM model,
it is becoming increasingly hard to test a new model properly if it differs too much
from the ΛCDM model. This is not due to fear of new paradigms, but is just a
practical problem with large and complex systems.
A.2.4 Selecting the right model
Having a cosmological model like the ΛCDM model that fits the data rather
well, one can always add an extra free parameter and get at least an equally good
fit, and probably an even better fit than in the old model. By adding infinitely
many extra parameters one could fit any kind of data perfectly. To select the “best”
cosmological model it is common to apply Bayesian selection techniques, which will
disfavour the inclusion of extra parameters unless they improve the fit to the data
significantly. This can be done under the assumption that nature should be
“simple” at its most fundamental level. In the end, all we can do in natural science
is to make models that are as simple as possible and still fit the observational data. Whether
such a model represents the “true nature” or not is beyond the reach of science to find
out.
Or, as Robin Dunbar has written in the context of explaining from an evolutionary point of view why humans tend to be religious:
“There is little to be gained by having an explanation that is so complex or difficult to confirm that we waste valuable time on it when we
could be out foraging or finding mates.”
-Robin Dunbar [70]
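The penalty on extra parameters described in this section can be illustrated with a small numerical sketch. The thesis does not say which Bayesian selection technique is meant, so the Bayesian information criterion (BIC) is used here purely as an assumed stand-in, and all numbers are hypothetical.

```python
import math

def bic(chi2_min: float, n_params: int, n_data: int) -> float:
    """Bayesian information criterion, chi2_min + k * ln(N).

    Assumes chi2_min approximates -2 log(L_max); lower BIC is preferred.
    """
    return chi2_min + n_params * math.log(n_data)

# Hypothetical numbers, for illustration only.
n_data = 1000
base = bic(chi2_min=1010.0, n_params=6, n_data=n_data)
extended = bic(chi2_min=1008.0, n_params=7, n_data=n_data)

# The extra parameter lowers chi2 by 2 but costs ln(1000), roughly 6.9,
# so the simpler model keeps the lower BIC.
prefer_simple = base < extended
print(base, extended, prefer_simple)
```

In this toy case the improvement in the fit is too small to justify the additional parameter, which is the behaviour the text describes.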
Appendix B
Derivation of dq/dη
We want to find an expression for the $dq/d\eta$ term in (4.38). To do this we have to
start with the geodesic equation. In the derivation, I will use the “old” energy and
momentum variables, E and p, and t as a time component, and then convert to ε,
q and η in the end. When using t as a time variable, the $g_{00}$ component will be
$1 + 2\phi$ (without the $a^2$ in front).
The time component of the geodesic equation is given by [71]
\[
\frac{dP^0}{d\lambda} + \Gamma^0_{\alpha\beta} P^\alpha P^\beta = 0. \tag{B.1}
\]
We rewrite the first term as
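\[
\frac{dP^0}{d\lambda} = \frac{dP^0}{dt}\frac{dt}{d\lambda} = \dot{P}^0 P^0. \tag{B.2}
\]
Using (4.35) we can now write the geodesic equation as
\[
\frac{d}{dt}\left[E(1-\phi)\right] = -\Gamma^0_{\alpha\beta}\frac{P^\alpha P^\beta}{E}(1+\phi). \tag{B.3}
\]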
Multiplying both sides by (1 + φ) we find
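\[
\begin{aligned}
\frac{dE}{dt} &= E\frac{d\phi}{dt} - \Gamma^0_{\alpha\beta}\frac{P^\alpha P^\beta}{E}(1+2\phi) \\
&= E\left[\frac{\partial\phi}{\partial t} + \frac{\partial\phi}{\partial x^i}\frac{dx^i}{dt}\right] - \Gamma^0_{\alpha\beta}\frac{P^\alpha P^\beta}{E}(1+2\phi) \\
&= E\left[\frac{\partial\phi}{\partial t} + \frac{\partial\phi}{\partial x^i}\frac{\hat{p}^i p}{aE}\right] - \Gamma^0_{\alpha\beta}\frac{P^\alpha P^\beta}{E}(1+2\phi),
\end{aligned}
\tag{B.4}
\]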
where I have used (4.39) to get to the last line. For calculating this last term we
can use the expression for the Christoffel symbols in terms of the metric given in
(3.17). We then have
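\[
\Gamma^0_{\alpha\beta}\frac{P^\alpha P^\beta}{E} = \frac{1}{2}\, g^{0\lambda}\left[g_{\alpha\lambda,\beta} + g_{\beta\lambda,\alpha} - g_{\alpha\beta,\lambda}\right]\frac{P^\alpha P^\beta}{E}. \tag{B.5}
\]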
Since the product $P^\alpha P^\beta$ is symmetric in α and β, the first two terms in the brackets
will be equal. Since our metric (4.34) is diagonal, we see that we will only get a
contribution when λ = 0. Then the first two terms in the brackets in (B.5) will only
give contributions from the $g_{00,\beta} = 2\phi_{,\beta}$ components. That gives us
\[
\Gamma^0_{\alpha\beta}\frac{P^\alpha P^\beta}{E} = \frac{1}{2}(1-2\phi)\left[4\phi_{,\beta}P^0P^\beta - g_{\alpha\beta,0}P^\alpha P^\beta\right]\frac{1}{E}. \tag{B.6}
\]
Here the last term can be expanded like
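\[
\begin{aligned}
-g_{\alpha\beta,0}\frac{P^\alpha P^\beta}{E}
&= -g_{00,0}\frac{P^0P^0}{E} - \underbrace{g_{ij,0}}_{=-\frac{\partial}{\partial t}\left[a^2(1-2\psi)\delta_{ij}\right]}\frac{P^iP^j}{E} \\
&= -2\phi_{,0}E + \left(-2a^2\delta_{ij}\psi_{,0} - 4\psi\delta_{ij}a\dot{a} + 2\dot{a}a\delta_{ij}\right)\frac{P^iP^j}{E} \\
&= -2\dot{\phi}E - 2a^2\delta_{ij}\left[\dot{\psi} - H(1-2\psi)\right]\frac{P^iP^j}{E}.
\end{aligned}
\tag{B.7}
\]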
From (4.37) we find
\[
P^i = \hat{p}^i\,\frac{p}{a}(1+\psi) \;\Rightarrow\; \delta_{ij}P^iP^j = \frac{p^2}{a^2}(1+\psi)^2 \approx \frac{p^2}{a^2}(1+2\psi). \tag{B.8}
\]
Using this in (B.7) and (B.6) yields
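\[
\begin{aligned}
\Gamma^0_{\alpha\beta}\frac{P^\alpha P^\beta}{E}
&= \frac{1-2\phi}{2}\left\{4\phi_{,\beta}\frac{P^0P^\beta}{E} - 2\dot{\phi}E - 2\frac{p^2}{E}\left[\dot{\psi} - H(1-2\psi)\right](1+2\psi)\right\} \\
&= (1-2\phi)\left\{2\phi_{,\beta}P^\beta - \dot{\phi}E - \frac{p^2}{E}\left[\dot{\psi} - H\right]\right\}.
\end{aligned}
\tag{B.9}
\]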
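The step between the two lines of (B.9) uses only that φ, ψ and their derivatives are first-order quantities, so that products of two of them are dropped, together with $P^0 = E(1-\phi)$ from the rewriting leading to (B.3). A sketch of the intermediate algebra, written out here for convenience and not taken from the original text:

```latex
\begin{align*}
(1+2\psi)\left[\dot{\psi} - H(1-2\psi)\right]
  &= \dot{\psi} - H + 2H\psi - 2H\psi + 2\psi\dot{\psi} + 4H\psi^2
   \approx \dot{\psi} - H, \\
\frac{1}{2}\cdot 4\phi_{,\beta}\frac{P^0 P^\beta}{E}
  &= 2\phi_{,\beta}P^\beta(1-\phi)
   \approx 2\phi_{,\beta}P^\beta .
\end{align*}
```

The two first-order terms $\pm 2H\psi$ cancel exactly, which is why no ψ remains next to H in the last line of (B.9).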
Inserting this result into (B.4) we find
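\[
\begin{aligned}
\frac{dE}{dt} &= E\left[\dot{\phi} + \phi_{,i}\frac{\hat{p}^i p}{aE}\right] - (1+2\phi)(1-2\phi)\Big\{\underbrace{2\phi_{,\beta}P^\beta}_{=2\dot{\phi}E(1-\phi) + 2\phi_{,i}\frac{\hat{p}^i p}{a}(1+\psi)} - \dot{\phi}E - \frac{p^2}{E}(\dot{\psi} - H)\Big\} \\
\frac{1}{E}\frac{dE}{dt} &= 2\dot{\phi} + \phi_{,j}\frac{\hat{p}^j p}{aE} - 2\dot{\phi}(1-\phi) - 2\phi_{,i}\frac{\hat{p}^i p}{aE}(1+\psi) + \frac{p^2}{E^2}(\dot{\psi} - H) \\
&= -\phi_{,i}\frac{\hat{p}^i p}{aE} + \frac{p^2}{E^2}(\dot{\psi} - H).
\end{aligned}
\tag{B.10}
\]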
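What we are looking for is an expression for $dq/d\eta$. We start by converting to an
expression for $dp/dt$. From $E = \sqrt{m^2 + p^2}$ we have that $\frac{dE}{dt} = \dot{p}\,\frac{p}{E}$, and we find that
\[
\frac{d}{dt}p = -\frac{E}{a}\phi_{,i}\hat{p}^i + p(\dot{\psi} - H). \tag{B.11}
\]
Changing to our preferred variables ε, q and η (with $p = q/a$ and $E = \epsilon/a$) and using that
$\frac{d}{dt} = \frac{1}{a}\frac{d}{d\eta}$ and $\frac{d}{d\eta}p = \frac{q'a - a'q}{a^2}$, where a prime denotes a derivative with respect to η and $\mathcal{H} \equiv a'/a = aH$,
we have
\[
\begin{aligned}
\frac{d}{d\eta}p &= -\frac{\epsilon}{a}\phi_{,i}\hat{p}^i + q(\dot{\psi} - H) \\
\frac{d}{d\eta}q &= -\phi_{,i}\hat{p}^i\epsilon + aq\left(\frac{1}{a}\psi' - \frac{1}{a}\mathcal{H}\right) + \mathcal{H}q \\
&= -\phi_{,i}\hat{p}^i\epsilon + q\psi',
\end{aligned}
\tag{B.12}
\]
which is the result that we were looking for.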
Appendix C
MCMC and CosmoMC
Here I will review some of the properties of Markov chain Monte Carlo (MCMC)
methods and how they are applied in the CosmoMC code, which I have used for cosmological parameter estimation. First of all I will present the likelihood function,
which is central in the CosmoMC runs.
Unless other references are given, this appendix is based on the references [72]
and [57].
C.1 The likelihood function
An important quantity when analyzing cosmological data and comparing it to cosmological models is the likelihood function, L, which is defined as the probability
of a data set given a model (see e.g. [49, 72]). Say we have a vector of observable quantities $z = (z_1, \ldots, z_n)$ and a vector of unobservable model parameters
$\theta = (\theta_1, \ldots, \theta_d)$. In our case z will consist of data from e.g. CMB and LSS experiments and θ will be a set of free cosmological parameters like Mν, h, ΩΛ etc.
Then the likelihood function is given by
\[
L = P(z|\theta)P(\theta). \tag{C.1}
\]
Here P(θ) contains possible information on priors on θ. This could for instance
be an HST prior on h, a BBN prior on ωb, or a Heidelberg-Moscow prior on Mν.
Other priors could for instance be given by the flatness assumption. But what we
really are interested in are the theoretical parameters given some data, P (θ|z), and
not the other way around. This quantity can easily be found from L using Bayes’
theorem¹:
\[
P(\theta|z) = \frac{L}{m(z)} = \frac{P(z|\theta)P(\theta)}{\int P(z|\theta)P(\theta)\,d\theta}, \tag{C.2}
\]
where m(z) is called the marginal over z. We see that P(θ|z) ∝ L, which means
that for most practical purposes we can use L instead of P(θ|z) when comparing
the likelihoods of different parameter sets given the data. To find the probability distribution for one of the parameters θi in our parameter vector, one has to
integrate the total probability function over all the other parameters:
\[
P(\theta_i|z) = \int\!\cdots\!\int P(\theta|z)\,d\theta_1 \ldots d\theta_{i-1}\,d\theta_{i+1} \ldots d\theta_d. \tag{C.3}
\]
This process is referred to as marginalization. I often refer to two-dimensional
probability plots, where probability contours are shown in a two-parameter plane.
In this case P(θ|z) has been integrated over all but the two relevant parameters.
¹ Named after Thomas Bayes, British mathematician (1702-1761), also known for the work “Divine Benevolence, or an Attempt to Prove That the Principal End of the Divine Providence and Government is the Happiness of His Creatures” (1731).
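As a minimal illustration of marginalization, the integral in (C.3) can be approximated by a sum when the posterior is tabulated on a regular grid. The grid values, the spacing and the function name `marginalize_2d` below are hypothetical and only serve to show the operation; this is not how CosmoMC itself stores or integrates the posterior.

```python
def marginalize_2d(posterior, d_theta2):
    """Integrate a gridded P(theta1, theta2 | z) over theta2 (rectangle rule).

    posterior[i][j] is the posterior density at grid point (theta1_i, theta2_j).
    """
    return [sum(row) * d_theta2 for row in posterior]

# Hypothetical toy posterior on a 3 x 4 grid with unit spacing.
posterior = [
    [0.00, 0.05, 0.05, 0.00],
    [0.05, 0.20, 0.20, 0.05],
    [0.05, 0.15, 0.15, 0.05],
]

# One value per theta1 grid point: the marginal distribution of theta1.
p_theta1 = marginalize_2d(posterior, d_theta2=1.0)
print(p_theta1)
```

Each entry of `p_theta1` is the total probability of one θ1 value after summing over all θ2 values, and the entries add up to one because the toy grid is normalized.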
For comparing models with different parameter values a standard method is to
quote χ² values, where χ² is defined by $\chi^2 \equiv \sum_i \frac{(E_i - z_i)^2}{E_i}$, where $E_i$ denotes the
value expected by the theoretical model. There is a simple relation between χ² and
L given by [73]
\[
\chi^2 \approx -2\log L \tag{C.4}
\]
under certain smoothness conditions that we don’t have to worry too much about
in our physical problems.
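The relation (C.4) means that a difference in χ² between two models translates directly into a likelihood ratio. The following sketch uses the definition of χ² given above; the data values and the two sets of model predictions are hypothetical.

```python
import math

def chi2(expected, observed):
    """chi^2 as defined in the text: sum_i (E_i - z_i)^2 / E_i."""
    return sum((e - z) ** 2 / e for e, z in zip(expected, observed))

def likelihood_ratio(chi2_a, chi2_b):
    """L_a / L_b implied by chi^2 = -2 log L (any common constant cancels)."""
    return math.exp(-0.5 * (chi2_a - chi2_b))

# Hypothetical data z_i and predictions E_i from two models.
observed = [9.0, 21.0, 30.0]
model_a = [10.0, 20.0, 30.0]
model_b = [12.0, 18.0, 33.0]

chi2_a = chi2(model_a, observed)
chi2_b = chi2(model_b, observed)
print(chi2_a, chi2_b, likelihood_ratio(chi2_a, chi2_b))
```

Model A has the smaller χ² here, so the likelihood ratio is larger than one in its favour.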
C.2 CosmoMC
Much of the data analysis in this master thesis is done with the publicly available code
CosmoMC [57]. This code uses a Markov chain Monte Carlo (MCMC) approach
with multiple runs of the Boltzmann code CAMB [53] (which in turn is based on
the code CMBFAST [61]).
A Boltzmann code like CAMB takes an input of cosmological parameters and
calculates cosmological quantities like the CMB and LSS power spectra using linear perturbation theory and line of sight integration [61]. Having these theoretical
spectra, one may compare the given theoretical model with observational data and
calculate the corresponding value of the likelihood function.
A typical cosmological model has around d ∼ 10 free parameters. Finding the
preferred ranges for all these parameters requires knowledge of L in a huge number of points in this d-dimensional parameter space, even if one is interested in the
probability distribution for just one of the parameters, say for example the neutrino mass. This calls for an efficient method for selecting which parameters to run
the Boltzmann code with to get reliable parameter limits with as few evaluations
as possible. This is where the MCMC method enters the stage. MCMC methods have proved to be extremely efficient in getting a relevant sample of points in
high-dimensional parameter spaces. It also has the property that its runtime ideally
scales approximately linearly in d, not exponentially as a standard grid based technique does.
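The exponential cost of a grid is easy to make concrete. The sketch below simply counts the likelihood evaluations needed for a full regular grid; the choice of 10 grid points per parameter is an assumption made for illustration.

```python
def grid_evaluations(points_per_dimension: int, n_parameters: int) -> int:
    """Number of likelihood evaluations needed for a full regular grid."""
    return points_per_dimension ** n_parameters

# Hypothetical resolution of 10 grid points per parameter.
for d in (2, 5, 10):
    print(d, grid_evaluations(10, d))
```

With d = 10 parameters the grid already requires 10^10 runs of the Boltzmann code, which is why a sampling method with roughly linear scaling is needed.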
A Markov chain is a discrete mathematical chain of distributions in a parameter
space. Its next position in parameter space is only based on its present position. If
one chooses a smart way to generate the Markov chain, it can be given the property
of always converging to a stationary distribution. For practical applications one
usually creates chains by letting single random walkers move through the parameter space, instead of using the distribution directly. It turns out that this random
walker, after a sufficiently large number of steps, will sample the total probability
distribution.
In CosmoMC we want the parameter distribution to converge to the preferred
values of our theoretical parameters given the data sets that we are considering. We
name this stationary distribution P (θ) where θ still is a vector in our d-dimensional
parameter space. The algorithm used in CosmoMC for making the chain converge
to this distribution is the Metropolis-Hastings algorithm.
In general the random walker will move from one point in parameter space θ1
to the next one θ2 with a transition probability T (θ1 , θ2 ). The Metropolis-Hastings
algorithm is a method to generate this transition probability in such a way that
the chain will end up at the desired stationary value. This is done based on a
proposal density q(θn , θn+1 ). Ideally this proposal density is close to the stationary
distribution P , which will make convergence a lot faster. In our case we often have
some good clues for which stationary distribution to expect, and we can feed the
algorithm with a rather good initial proposal density. It is common to give the chain
a “burn-in” period to converge before starting the sampling.
The proposal density is used to propose a new point θn+1 for the random
walker to move to. Then the Metropolis-Hastings algorithm takes the proposed
point through an acceptance test. The proposed move will be accepted with a probability α(θn , θn+1 ). The transition probability is then given by
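\[
T(\theta_n, \theta_{n+1}) = \alpha(\theta_n, \theta_{n+1})\,q(\theta_n, \theta_{n+1}). \tag{C.5}
\]
The acceptance probability $\alpha(\theta_n, \theta_{n+1})$ is
\[
\alpha(\theta_n, \theta_{n+1}) = \min\left\{1, \frac{P(\theta_{n+1})\,q(\theta_{n+1}, \theta_n)}{P(\theta_n)\,q(\theta_n, \theta_{n+1})}\right\}. \tag{C.6}
\]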
That means, roughly speaking, that the probability of acceptance is reduced if the
proposed move will bring the chain to a place with a smaller value of L. This will
then over time make the chain converge to its stationary value.
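The acceptance rule (C.6) can be written as a one-line function. This is only a sketch of the formula; the function and argument names are hypothetical, and the current point is assumed to have non-zero probability.

```python
def acceptance_probability(p_current, p_proposed, q_forward, q_backward):
    """Metropolis-Hastings acceptance probability, as in (C.6).

    p_current, p_proposed: target density P at the current and proposed points.
    q_forward:  proposal density q(current, proposed).
    q_backward: proposal density q(proposed, current).
    Assumes p_current > 0 and q_forward > 0.
    """
    return min(1.0, (p_proposed * q_backward) / (p_current * q_forward))

# With a symmetric proposal the q factors cancel.
uphill = acceptance_probability(0.1, 0.2, q_forward=1.0, q_backward=1.0)
downhill = acceptance_probability(0.2, 0.1, q_forward=1.0, q_backward=1.0)
print(uphill, downhill)
```

A move to a point with higher probability is always accepted, while a move to a point with half the probability is accepted half of the time.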
How this works in practice in CosmoMC is that a random walker starts in a random position $\theta_0$ in the parameter space. Then CosmoMC calls CAMB for calculating the CMB and LSS power spectra and the background cosmological evolution
with the parameter set $\theta_0$. CAMB compares its theoretical power spectrum with
the desired data sets and outputs a likelihood value to CosmoMC. Then CosmoMC
uses the assumed parameter distribution put in by hand by the user and picks a new
random set of parameters $\theta_1$. CAMB is called and calculates the likelihood for $\theta_1$.
On the basis of $L(\theta_1)$ and $L(\theta_0)$ CosmoMC will have to decide whether to
accept this new point in its chain or not. A random number 0 < u < 1 is generated,
and $\theta_1$ is accepted in the chain if u satisfies
\[
u < \min\left\{1, \frac{P(\theta_1)\,P(z|\theta_1)\,q(\theta_1, \theta_0)}{P(\theta_0)\,P(z|\theta_0)\,q(\theta_0, \theta_1)}\right\}. \tag{C.7}
\]
If the proposed $\theta_1$ is accepted, the new parameter vector will be placed in the
chain and the process is run over again with a new proposed $\theta_2$. If the proposed $\theta_1$ is
rejected, $\theta_1$ will be set to the same value as $\theta_0$, and a new point, $\theta_2$, will be proposed
and tested.
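The accept/reject loop described above can be sketched in a few lines. This is not CosmoMC code: the expensive CAMB likelihood is replaced by a hypothetical one-dimensional Gaussian `log_target`, and a symmetric Gaussian proposal is assumed so that the q factors in (C.7) cancel.

```python
import math
import random

def log_target(theta):
    """Toy stand-in for log[P(theta) P(z|theta)]: a unit Gaussian centred on 1."""
    return -0.5 * (theta - 1.0) ** 2

def metropolis_chain(n_steps, theta0, step_size, seed=0):
    """Run a single random walker with the Metropolis-Hastings rule."""
    rng = random.Random(seed)
    chain = [theta0]
    current, current_logp = theta0, log_target(theta0)
    for _ in range(n_steps):
        # Symmetric proposal: q(a, b) = q(b, a), so q drops out of the ratio.
        proposal = current + rng.gauss(0.0, step_size)
        proposal_logp = log_target(proposal)
        u = rng.random()
        # Accept if u < min(1, ratio); the ratio is evaluated in log space.
        if u < math.exp(min(0.0, proposal_logp - current_logp)):
            current, current_logp = proposal, proposal_logp
        # A rejected proposal repeats the current point in the chain.
        chain.append(current)
    return chain

chain = metropolis_chain(n_steps=20000, theta0=5.0, step_size=1.0)
samples = chain[1000:]  # discard an assumed burn-in of 1000 steps
mean = sum(samples) / len(samples)
print(len(chain), mean)
```

Although the walker starts far from the peak, the sample mean after burn-in ends up close to the centre of the toy distribution.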
In CosmoMC one typically uses a burn-in time of ∼ 1000 chain steps. After
this burn-in time the chain should have converged to its stationary distribution
which is used for calculating the parameter confidence intervals. For the random
walker, the parameter values in $\theta_l$ depend on the parameter vector in the previous step,
$\theta_{l-1}$. Thus neighboring steps in the chain are correlated. Since we want uncorrelated samples for making the parameter analysis, one usually keeps only every
10th to 1000th element in the chain for later analysis. This process is referred to as
chain thinning.
The remaining sample that we want to analyze is then the full chain minus the
removed samples from the initial burn-in period and the samples removed in the
thinning process. The probability for e.g. the neutrino mass Mν to be in a certain
bin is now the number of chain samples within this bin divided by the total number
of samples.
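Burn-in removal, thinning and the binned probability estimate can be sketched as follows. The chain values, the burn-in length, the thinning factor and the bin edges are all hypothetical and far smaller than in a real run.

```python
def thin_chain(chain, burn_in, thin):
    """Drop the first burn_in steps, then keep every thin-th element."""
    return chain[burn_in::thin]

def bin_probabilities(samples, edges):
    """Fraction of all samples falling in each bin [edges[k], edges[k+1])."""
    counts = [0] * (len(edges) - 1)
    for x in samples:
        for k in range(len(counts)):
            if edges[k] <= x < edges[k + 1]:
                counts[k] += 1
                break
    return [c / len(samples) for c in counts]

# Hypothetical chain of M_nu values, for illustration only.
chain = [0.9, 0.8, 0.7, 0.3, 0.2, 0.25, 0.1, 0.4, 0.15, 0.35, 0.05, 0.3]
kept = thin_chain(chain, burn_in=3, thin=2)
probs = bin_probabilities(kept, edges=[0.0, 0.2, 0.4, 0.6])
print(kept)   # [0.3, 0.25, 0.4, 0.35, 0.3]
print(probs)  # [0.0, 0.8, 0.2]
```

Four of the five remaining samples fall in the middle bin, so that bin is assigned a probability of 0.8.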
CosmoMC distinguishes between “fast” and “slow” parameters. The fast parameters are parameters like amplitudes and quantities governing the shape of the
primordial power spectrum. Changing a fast parameter does not require a total
recalculation of the linear perturbation equations. The slow parameters are the
parameters governing the evolution of perturbations, like ΩΛ and Mν . Changing
a slow parameter requires a total recalculation of the power spectra with CAMB.
This splitting into fast and slow parameters is used to make the code more effective. Only parameters from either the fast or slow group are changed at a time,
decreasing the required number of time consuming runs with CAMB.
One of the problems in MCMC is to know when the chain has converged sufficiently. Therefore it is efficient to run several chains simultaneously and let the
individual chains run until they are converging to the same distribution. This
is done by defining a parameter
\[
R = \frac{\text{variance of chain means}}{\text{mean of chain variances}}
\]
for every parameter, using
the last half of the generated chain. For each step in the chains, the worst value of
R is compared with a user-defined upper limit. As soon as the worst R is below this
limit, the chains will stop. In my runs this upper limit has been set to R − 1 = 0.02.
My runs of CosmoMC were done on the Titan cluster at the University of Oslo,
running eight chains in parallel.
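The convergence ratio defined above can be computed for a single parameter as in the following sketch. It implements only the ratio as written in the text (variance of chain means over mean of chain variances, using the last half of each chain); the function names and the toy chains are hypothetical, and this is not the CosmoMC implementation.

```python
def mean(xs):
    return sum(xs) / len(xs)

def variance(xs):
    """Unbiased sample variance."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def convergence_ratio(chains):
    """Variance of chain means divided by mean of chain variances.

    Only the last half of each chain is used, as described in the text.
    """
    halves = [c[len(c) // 2:] for c in chains]
    chain_means = [mean(h) for h in halves]
    chain_vars = [variance(h) for h in halves]
    return variance(chain_means) / mean(chain_vars)

# Hypothetical toy chains for one parameter.
agreeing = [[9.0, 9.0, 1.0, 2.0, 3.0], [0.0, 0.0, 0.0, 3.0, 2.0, 1.0]]
disagreeing = [[9.0, 9.0, 1.0, 2.0, 3.0], [0.0, 0.0, 0.0, 4.0, 5.0, 6.0]]
print(convergence_ratio(agreeing), convergence_ratio(disagreeing))
```

Chains whose second halves sample the same region give a ratio near zero, while chains stuck in different regions give a large ratio and would be run for longer.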
Bibliography
[1] David O. Caldwell, editor. Current Aspects of Neutrino Physics. Springer,
2001.
[2] F. Mandl and G. Shaw. Quantum Field Theory. Wiley, 1993.
[3] Fayyazuddin and Riazuddin. A Modern Introduction To Particle Physics.
World Scientific, 1992.
[4] John A. Peacock. Cosmological Physics. Cambridge, 1999.
[5] R. N. Mohapatra. Physics of the neutrino mass. New Journal of Physics,
6(82), 2004.
[6] Heinrich Päs. Neutrino masses and particle physics beyond the standard
model. Ann. Phys., 9, 2000.
[7] MINOS collaboration. Web page. http://www-numi.fnal.gov/.
[8] M. Apollonio et al. Limits on neutrino oscillations from the CHOOZ experiment. Physics Letters B, pages 415–430, 1999.
[9] F. Boehm et al. Search for neutrino oscillations at the palo verde nuclear
reactors. Phys. Rev. D, 84(17), 2000.
[10] Yi-fang Wang. Recent results of non-accelerator-based neutrino experiments.
Int. J. Mod. Phys., A20:5244–5253, 2005, hep-ex/0411028.
[11] V. Berezinsky, M. Narayan, and F. Vissani. Mirror model for sterile neutrinos.
Nucl.Phys.B, 658:254–280, 2003.
[12] K. Zuber. Experimental neutrino physics. Int. J. Mod. Phys., A20:2895, 2005,
hep-ex/0502039.
[13] John N. Bahcall and Carlos Peña-Garay. Solar models and solar neutrino
oscillations. New Journal of Physics, 6(63), 2004.
[14] Hisakazu Minakata and Hiroshi Nunokawa. Inverted hierarchy of neutrino
masses disfavored by supernova 1987a. Phys. Lett., B504:301–308, 2001,
hep-ph/0010240.
[15] Julien Lesgourgues, Sergio Pastor, and Laurence Perotto. Probing neutrino
masses with future galaxy redshift surveys. Phys. Rev., D70:045016, 2004,
hep-ph/0403296.
[16] A. Slosar. Detecting neutrino mass difference with cosmology. 2006, astro-ph/0602133.
[17] B. Povh, K. Rith, C. Scholz, and F. Zetsche. Particles and Nuclei, An Introduction to the Physical Concepts, 4th edition. Springer, 2003.
[18] H.V. Klapdor-Kleingrothaus et al. Search for neutrinoless double beta decay
with enriched 76Ge in Gran Sasso 1990–2003. Phys. Lett. B, 586:198–212, 2004.
[19] H. V. Klapdor-Kleingrothaus. First evidence for neutrinoless double beta
decay - and world status of double beta experiments. 2005, hep-ph/0512263.
[20] Ø. Elgarøy and Ofer Lahav. Neutrino masses from cosmological probes. New
J. Phys., 7:61, 2005, hep-ph/0412075.
[21] Ofelia Pisanti and Pasquale D. Serpico. Neutrinos and cosmology: An update.
AIP Conf. Proc., 794:232–235, 2005, astro-ph/0507346.
[22] Sergio Pastor. Neutrino mass bounds from cosmological observables. 2005,
hep-ph/0505148.
[23] S. Hannestad. Introduction to neutrino cosmology. neutrinos in cosmology.
Prog. Part. Nucl. Phys., 57:309–323, 2006, astro-ph/0511595.
[24] J. Lesgourgues and S. Pastor. Massive neutrinos and cosmology. 2006, astro-ph/0603494.
[25] V. Barger et al. Effective number of neutrinos and baryon asymmetry from
BBN and WMAP. Phys. Lett. B, 566(8), 2003.
[26] K. Ichikawa, M. Fukugita, and M. Kawasaki. Constraining neutrino masses by
CMB experiments alone. Phys. Rev., D71:043001, 2005, astro-ph/0409768.
[27] Ø. Elgarøy et al. New upper limit on the neutrino mass from the 2 degree
field galaxy redshift survey. Phys. Rev. Lett., 89, 2002.
[28] Steen Hannestad. Neutrino masses and the number of neutrino species from
WMAP and 2dFGRS. JCAP, 2003.
[29] Max Tegmark et al. Cosmological parameters from SDSS and WMAP. Phys.
Rev. D, 69, 2004.
[30] P. Crotty, J. Lesgourgues, and S. Pastor. Current cosmological bounds on
neutrino masses and relativistic relics. Phys. Rev. D, 69(123007), 2004.
[31] V. Barger, D. Marfatia, and Adam Tregre. Neutrino mass limits from SDSS,
2dFGRS and WMAP. Phys. Lett. B, 595:55–59, 2004.
[32] U. Seljak et al. Cosmological parameter analysis including SDSS ly-alpha
forest and galaxy bias: constraints on the primordial spectrum of fluctuations,
neutrino mass, and dark energy. Phys. Rev. D, 71(103515), 2005.
[33] A. Goobar, S. Hannestad, E. Mortsell, and H. Tu. A new bound on the neutrino mass from the SDSS baryon acoustic peak. 2006, astro-ph/0602155.
[34] U. Seljak, A. Slosar, and P. McDonald. Cosmological parameters from combining the lyman-alpha forest with CMB, galaxy clustering and SN constraints. 2006, astro-ph/0604335.
[35] Steen Hannestad. Neutrino masses and the dark energy equation of state:
Relaxing the cosmological neutrino mass bound. Phys. Rev. Lett., 95:221301,
2005, astro-ph/0505551.
[36] G.C. Branco and J.I. Silva-Marcos. Patterns for the neutrino mass matrices
and mixings. In Recent Developements in Particle Physics and Cosmology,
NATO science series, pages 63–87, 2001.
[37] E. Kh. Akhmedov et al. Neutrino masses and mixing with seesaw mechanism
and universal breaking of extended democracy. Phys. Lett. B, 498:237–250,
2001.
[38] R. N. Mohapatra. Understanding neutrino masses and mixings within the
seesaw framework. 2003, hep-ph/0306016.
[39] D. N. Spergel et al. Wilkinson Microwave Anisotropy Probe (WMAP) three
year results: Implications for cosmology. 2006, astro-ph/0603449.
[40] Finn Ravndal. Lecture notes from FYS5130 - Cosmological Physics.
http://folk.uio.no/josteirk/FYS5130/, 2005. Course given
at Department of Physics, University of Oslo 2005.
[41] H. S. Kang and G. Steigman. Cosmological constraints on neutrino degeneracy. Nucl. Phys. B, 372, 1992.
[42] A. D. Dolgov et al. Cosmological bounds on neutrino degeneracy improved
by flavor oscillations. Nucl. Phys. B, 632:363, 2002.
[43] Chao-lin Kuo et al. High resolution observations of the cmb power spectrum
with acbar. Astrophys. J., 600:32–51, 2004, astro-ph/0212289.
[44] Timothy J. Pearson et al. The anisotropy of the microwave background to l =
3500: Mosaic observations with the Cosmic Background Imager. Astrophys.
J., 591:556–574, 2003, astro-ph/0205388.
[45] K. Grainge et al. The CMB power spectrum out to l=1400 measured by the
VSA. MNRAS, 341:L23, 2002, astro-ph/0212495.
[46] John E. Ruhl et al. Improved measurement of the angular power spectrum
of temperature anisotropy in the CMB from two new analyses of BOOMERANG observations. Astrophys. J., 599:786–805, 2003, astro-ph/0212229.
[47] Patrick McDonald et al. The linear theory power spectrum from the Lyman-alpha forest in the Sloan Digital Sky Survey. Astrophys. J., 635:761–783, 2005,
astro-ph/0407377.
[48] V. F. Mukhanov, H.A. Feldman, and R.H. Brandenberger. Theory of cosmological perturbations. Phys. Rep, 215, 1992.
[49] Scott Dodelson. Modern Cosmology. Academic Press, 2003.
[50] Morad Amarzguioui. Cosmological perturbation theory and gravitational entropy. Master’s thesis, Physics Department, University of Oslo, 2003.
[51] J. M. Bardeen. Gauge-invariant cosmological perturbations. Phys. Rev., D22(8):1882, 1980.
[52] C.P. Ma and E. Bertschinger. Cosmological perturbation theory in the synchronous and conformal newtonian gauges. Astrophys. J., 455:7, 1995.
[53] A. Lewis, A. Challinor, and A. Lasenby. Efficient computation of CMB anisotropies in closed FRW models. Ap. J., 538:473, 2000.
[54] Daniel J. Eisenstein et al. Detection of the baryon acoustic peak in the large-scale correlation function of SDSS luminous red galaxies. Astrophys. J.,
633:560–574, 2005, astro-ph/0501171.
[55] Wayne Hu, Daniel J. Eisenstein, and Max Tegmark. Weighing neutrinos with
galaxy surveys. Phys. Rev. Lett., 80:5255–5258, 1998, astro-ph/9712057.
[56] D. N. Spergel et al. First year wilkinson microwave anisotropy probe
(WMAP) observations: Determination of cosmological parameters. Astrophys. J. Suppl., 148:175, 2003, astro-ph/0302209.
[57] A. Lewis and S. Bridle. Cosmological parameters from CMB and other data:
a Monte-Carlo approach. Phys. Rev., D66:103511, 2002, astro-ph/0205436.
[58] W. Hu et al. Cosmic microwave background observables and their cosmological implications. Astrophys J., 549:669–680, 2001.
[59] J.C. Mather et al. Calibrator design for the COBE far-infrared absolute spectrophotometer (FIRAS). Astrophys. J., 512:511, 1999.
[60] W. Hu and N. Sugiyama. Toward understanding CMB anisotropies and their
implications. Phys. Rev. D, 51:2599, 1995.
[61] Uros Seljak and Matias Zaldarriaga. A line of sight approach to cosmic microwave background anisotropies. Astrophys. J., 469:437–444, 1996, astro-ph/9603033.
[62] Ø. Elgarøy and O. Lahav. The role of priors in deriving upper limits on neutrino masses from the 2dFGRS and WMAP. JCAP, 0304:004, 2003, astro-ph/0303089.
[63] W. L. Freedman et al. Final results from the Hubble space telescope key project to measure the Hubble constant. Astrophys. J., 553:47–72, 2001, astro-ph/0012376.
[64] S. Burles, K. M. Nollett, and M. S. Turner. What is the BBN prediction for
the baryon density and how reliable is it? Phys. Rev., D63:063512, 2001,
astro-ph/0008495.
[65] R. R. Caldwell, M. Kamionkowski, and N. N. Weinberg. Phantom energy
and cosmic doomsday. Phys. Rev. Lett., 91:071301, 2003, astro-ph/0302506.
[66] J. Lesgourgues, L. Perotto, S. Pastor, and M. Piat. Probing neutrino
masses with CMB lensing extraction. Phys. Rev., D73:045021, 2006, astro-ph/0511735.
[67] K. Ichikawa and T. Takahashi. On the determination of neutrino masses and
dark energy evolution. 2005, astro-ph/0510849.
[68] R. Fardon, A. E. Nelson, and N. Weiner. Dark energy from mass varying
neutrinos. JCAP, 0410:005, 2004.
[69] H. Alnes, M. Amarzguioui, and Ø. Grøn. An inhomogeneous alternative to
dark energy? 2005, astro-ph/0512006.
[70] R. Dunbar. The Human Story - A new history of mankind’s evolution. Faber
and Faber Limited, 2004.
[71] Ø. Grøn. Lecture notes on general relativity. Course compendium, University
of Oslo, 2004.
[72] N. Christensen, R. Meyer, L. Knox, and B. Luey. Ii: Bayesian methods
for cosmological parameter estimation from cosmic microwave background
measurements. Class. Quant. Grav., 18:2677, 2001, astro-ph/0103134.
[73] J. R. Rice. Mathematical Statistics and Data Analysis, 2nd ed. Wadsworth,
1995.