
Bayesian Epistemology
PHIL 218/338
Welcome and thank you!
Part I: What is Bayesian epistemology?
- Probabilities as credences
- The axioms of probability
- Conditionalisation
Part II: Applications and problems:
- Theism
Bear with me! Ideally we would discuss these
topics over several lectures.
What is Bayesian Epistemology?
Bayesianism is our “leading theory of uncertainty” (Alan Hájek and Stephan Hartmann)
It concerns credences, or degrees of belief, which are often less than certain
E.g. my credence that I’m not going to be attacked by a duck tomorrow
Bayesianism ≈ a theory about when our credences are rational or
justified (one which may complement other theories of justification)
There are many varieties of Bayesianism
(Irving Good calculated that there are at least 46,656!)
Bayesian epistemology is the “application of Bayesian methods to
epistemological problems.”
First component of Bayesianism:
Probabilities as credences
Traditional epistemology deals primarily with qualitative concepts
In Bayesian epistemology, these binary concepts are
arguably less central and therefore receive less attention
Bayesian epistemology deals largely with a quantitative
concept of credences
Credences ≈ degrees of belief or disbelief
In the 17th century, mathematicians Blaise Pascal and Pierre de
Fermat pioneered a representation of uncertainty as probabilities
Subjective interpretation of probability:
Subjective interpretation: ‘Probability is degree of belief’
But whose degree of belief?
Some actual person or
Some ideal person
This is the subjective or personal interpretation of probability because
these probabilities concern the psychological state of a subject or person
h = hypothesis/proposition, e.g. h = It will rain tomorrow
~h = negation of the hypothesis
P(h) = probability of the hypothesis, e.g. P(h) = probability that it will rain tomorrow
These terms are on your handout
Quantitative nature of credences
Credences (or subjective probabilities) are taken to be associated with a
numerical value or an interval
P(h) - decimal | P(h) in % | P(h) in normal language | P(~h) in normal language
1              | 100%      | h is certainly true     | ~h is certainly false
0              | 0%        | h is certainly false    | ~h is certainly true
> .5           | > 50%     | h is probably true      | ~h is probably not true
< .5           | < 50%     | h is probably not true  | ~h is probably true
Measuring credences
Consider your credence that h, the sun will
rise tomorrow
Consider your credence that you will (after
random selection) draw a red marble from
an urn containing
5 red marbles
5 black marbles
Are you more confident that the sun will rise than that you will draw a red marble?
If yes, then P(h) > .5
Measuring credences
Consider your credence that h, the sun will
rise tomorrow
Consider your credence that you will (after
random selection) draw a red marble from
an urn containing
90 red marbles
10 black marbles
Are you more confident that the sun will rise than that you will draw a red marble?
If yes, then P(h) > .9
Measuring credences
Consider your credence that h, the sun will
rise tomorrow
Consider your credence that you will (via
random selection) draw a red marble from
an urn containing
9,999 red marbles
1 black marble
Are you more confident that the sun will rise than that you will draw a red marble?
If yes, then P(h) > .9999
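The urn comparisons above can be read as a procedure for bounding a credence from below. The following is a minimal sketch, assuming the agent can answer each comparison with a plain yes or no; the two agents and the function names are hypothetical and only illustrate the idea.

```python
from fractions import Fraction

# The three urns from the slides, as (red marbles, black marbles).
URNS = [(5, 5), (90, 10), (9_999, 1)]


def credence_lower_bound(more_confident_than, urns=URNS):
    """Largest red-marble chance that the agent's confidence in h exceeds.

    Returns 0 if the agent is not more confident in h than in any draw,
    in which case no lower bound has been established.
    """
    bound = Fraction(0)
    for red, black in urns:
        chance_of_red = Fraction(red, red + black)
        if more_confident_than(chance_of_red):
            bound = max(bound, chance_of_red)
    return bound


def sunrise_believer(chance_of_red):
    # Hypothetical agent: more confident in the sunrise than in any of these draws.
    return True


def cautious_agent(chance_of_red):
    # Hypothetical agent: only more confident in h than in the 5-red/5-black draw.
    return chance_of_red <= Fraction(1, 2)


print(float(credence_lower_bound(sunrise_believer)))
print(float(credence_lower_bound(cautious_agent)))
```

Each "yes" answer only shows that P(h) is greater than the chance of drawing red from that urn, so the result is a lower bound rather than an exact credence.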
Measuring credences
What about your credence that:
It will rain tomorrow
You will be attacked by a duck tomorrow
Maybe an interval might represent your credences better
If h = It will rain tomorrow
Then P(h) = [.6, .7]
What do you think?
Can all of our credences be represented with numerical values?
Objections to the subjective interpretation
The probability of h given some evidence e does not mean someone’s actual
credence since there may be no actual credence that is relevant
It’s not clear that the probability of h given some evidence e is the credence
of some epistemically rational agent
When is an agent’s credence epistemically rational?
When their credence for h given e equals the (inductive) probability of h given e?
This is uninformative! (Patrick Maher)
Isn’t this just like saying “A proposition is true if and only if an omniscient God were
to believe it”? It’s uninformative
When their belief is not blameworthy from an epistemic point of view?
But someone might accidentally mistake the probability of h given e to be low and not be
blameworthy, even though the probability of h given e is in fact high (Patrick Maher)
Inductive probabilities are conceptual primitives – they can
be understood, but not expressed in terms of other simpler
concepts (Patrick Maher)
Probabilities are relative frequencies, which we might loosely
understand as the proportion of the time that something is
true (the frequentist interpretation of probability)
80% of the time when a student sits this course, it is true that they
60% of the time when a patient undergoes chemotherapy, it is true
that they will recover
Second component of Bayesianism:
Credences should conform to the
axioms (or rules) of probability
(A1) All probabilities are between 0 and 1,
i.e. 0 ≤ P(h) ≤ 1 for any h.
(A2) Logical truths have a probability of 1,
i.e. P(T) = 1 for any tautology T.
(A3) Where h1 and h2 are two mutually exclusive hypotheses, the
probability of h1 or h2 (h1 ∨ h2) is the sum of their respective probabilities,
i.e. P(h1 ∨ h2) = P(h1) + P(h2).
These are on your handout
The axioms in action
Suppose you draw a marble from an urn:
r = the marble you have drawn is red
~r = the marble you have drawn is not red
Suppose the urn contains 3 red marbles and 7
black marbles
You set
𝑃(𝑟) = .3 (30%)
𝑃(~𝑟) = .7 (70%)
These assignments conform to axiom 1
By axioms 2 and 3, 𝑃(𝑟 ∨ ~𝑟) = 1 (100%)
Arguments for conformity to the axioms
Argument from cases
Lindley draws out rules of probability from the urn example
We can prove other theorems using the axioms and see that they make
sense using the example
E.g. P(~r) = 1 − P(r)
Dutch book arguments
Dutch book = a combination of bets which an individual might
accept individually, but which collectively entail that they will
lose money
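The theorem P(~r) = 1 − P(r), mentioned under the argument from cases, follows from axioms A2 and A3 in a few steps. The derivation below is a sketch in LaTeX, writing {\sim}r for ~r; it assumes only that r ∨ ~r is a tautology and that r and ~r are mutually exclusive.

```latex
\begin{align*}
P(r \lor {\sim}r) &= 1
  && \text{(A2: } r \lor {\sim}r \text{ is a tautology)} \\
P(r \lor {\sim}r) &= P(r) + P({\sim}r)
  && \text{(A3: } r \text{ and } {\sim}r \text{ are mutually exclusive)} \\
P(r) + P({\sim}r) &= 1
  && \text{(from the two lines above)} \\
P({\sim}r) &= 1 - P(r)
\end{align*}
```

In the urn example this gives P(~r) = 1 − .3 = .7, which matches the assignment made there.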
A Dutch book
If one violates the probability axioms, then they are vulnerable to having
a Dutch book made against them
E.g. suppose you violate A2 or A3 by setting
1. P(r) = .7
2. P(~r) = .5
If you conform to axiom 2, then you do not conform to axiom 3
- By axiom 2, P(r ∨ ~r) = 1
- But by the above assignments 1 and 2, P(r) + P(~r) = .7 + .5 = 1.2
- So, contrary to axiom 3, P(r ∨ ~r) ≠ P(r) + P(~r) because 1 ≠ 1.2
But if you conform to axiom 3, then you do not conform to axiom 2
- By axiom 3, P(r ∨ ~r) = P(r) + P(~r)
- So by assignments 1 and 2, P(r ∨ ~r) = P(r) + P(~r) = .7 + .5 = 1.2
- So, contrary to axiom 2, P(r ∨ ~r) ≠ 1 because 1.2 ≠ 1
If you conform to axiom 2, then you do not conform to axiom 3
But if you conform to axiom 3, then you do not conform to axiom 2
So you cannot conform to the axioms
A Dutch book
Suppose you violate A2 or A3 by setting
1. P(r) = .7
2. P(~r) = .5
Bet 1 for assignment 1: you win $3 if r occurs and lose $7 if it does not
Bet 2 for assignment 2: you win $5 if r does not occur and lose $5 if it does
If r occurs, then you win $3 according to the first bet and lose $5
according to the second, so you lose $2
If r does not occur, then you lose $7 according to the first bet and
gain $5 according to the second, so you lose $2
Either way, you lose $2.
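The arithmetic of this Dutch book can be checked with a short sketch. It assumes that a credence of p in an event makes a price of p × $10 look fair for a ticket paying $10 if the event occurs; the $10 ticket size and the function names are assumptions chosen to reproduce the dollar amounts on the slide.

```python
from fractions import Fraction

STAKE = 10  # assumed ticket size in dollars; reproduces the $3/$7 and $5/$5 bets


def ticket_payoff(credence, wins, stake=STAKE):
    """Net result of paying credence * stake for a ticket worth `stake` if it wins."""
    price = credence * stake
    return (stake if wins else 0) - price


def total_payoff(p_r, p_not_r, r_occurs):
    """Combined result of one ticket on r and one ticket on ~r."""
    return ticket_payoff(p_r, r_occurs) + ticket_payoff(p_not_r, not r_occurs)


incoherent = (Fraction(7, 10), Fraction(1, 2))  # P(r) = .7, P(~r) = .5
coherent = (Fraction(7, 10), Fraction(3, 10))   # P(r) = .7, P(~r) = .3

for label, (p_r, p_not_r) in [("incoherent", incoherent), ("coherent", coherent)]:
    outcomes = [total_payoff(p_r, p_not_r, r_occurs) for r_occurs in (True, False)]
    print(label, [float(x) for x in outcomes])
```

With the incoherent credences the combined payoff is negative whether or not r occurs, whereas credences that sum to 1 break even in both cases.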
Dutch book argument
If someone violates the probability axioms, then she is
vulnerable to having a Dutch book made against her
One should avoid being vulnerable to having a Dutch
book made against her (because this is a rational defect)
Therefore, one should avoid violating the axioms of probability
An objection to the second component
Conformity to the axioms requires logical omniscience,
but no one is omniscient
“You’re right, but the component only sets an ideal
standard, irrespective of whether anyone can meet it”
Do you think that one’s credences
should conform to the axioms of probability?
Third component of Bayesianism:
Credences should be updated via conditionalisation
Before examining this component, we need to introduce some terms
Conditional probability
P(p|q)
= the probability of p on the condition that q obtains
= the probability of p given q
RATIO formula as an analysis of conditional probability:
P(p|q) = P(p & q) / P(q)
where P(q) > 0.
Example of a conditional probability
m = Taylor is a mother
f = Taylor is a female
P(m|f) = the probability that Taylor is a mother given that Taylor is a female
P(f) = .5
P(m & f) = .2
P(m|f) = P(m & f) / P(f) = .2 / .5 = .4
Note the big difference between P(m|f) and P(f|m)
P(m|f) = .4
P(f|m) = 1
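The ratio formula can be applied mechanically to the Taylor example. The sketch below is illustrative only: the function name is hypothetical, and the value P(m) = .2 is an assumption that follows if every mother is female, so that P(m) = P(m & f).

```python
from fractions import Fraction


def conditional(p_joint, p_condition):
    """Ratio formula: P(p|q) = P(p & q) / P(q), defined only when P(q) > 0."""
    if p_condition <= 0:
        raise ValueError("P(q) must be greater than 0")
    return p_joint / p_condition


P_F = Fraction(1, 2)        # P(f) = .5
P_M_AND_F = Fraction(1, 5)  # P(m & f) = .2
P_M = Fraction(1, 5)        # assumption: every mother is female, so P(m) = P(m & f)

p_m_given_f = conditional(P_M_AND_F, P_F)
p_f_given_m = conditional(P_M_AND_F, P_M)
print(float(p_m_given_f), float(p_f_given_m))
```

The same joint probability divided by two different conditions gives two different answers, which is why P(m|f) and P(f|m) must not be confused.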
A likelihood = P(e|h), where e represents some
evidence and h a hypothesis.
P(e|h) is called the likelihood of h on e.
Prior probabilities
P_i(h) = Your prior probability = “your subjective probability for the hypothesis
immediately before the evidence comes in” (emphasis added)
e = A person, such as Taylor, smiles at you
h = A person, such as Taylor, likes you
~h = A person, such as Taylor, does not like you
P_i(h) = prior probability of a person, such as Taylor, liking you
P(h|e) = probability of a person, such as Taylor, liking you given that s/he smiles at you
What is the probability that Taylor likes you given that he or she smiled at you?
P(h|e) = ?
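What is the prior probability that Taylor
likes you?
Suppose you surveyed 100 people and found the following:
                   | Smiles at you | Does not smile | Total
Likes you          | 9             | 1              | 10
Does not like you  | 36            | 54             | 90
P_i(h) = 10/100 = .1
What is the probability that Taylor likes
you given the evidence?
P(h|e) = ?
P(h|e) = 9/(9+36) = 9/45 = 1/5 = 20% = .2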
Posterior probabilities
What is the probability that Taylor likes you given the evidence?
P_i(h) = Your prior probability = “your subjective probability for the hypothesis immediately
before the evidence comes in” (Michael Strevens, emphasis added)
P_f(h) = Your posterior probability = “your subjective probability immediately after the
evidence (and nothing else) comes in” (emphasis added)
One should adjust their probability for h from their prior probability P_i(h) to a posterior
probability P_f(h) which equals P(h|e) when having acquired some evidence e (which has a
non-zero initial probability).
This is called conditionalising h on e.
Conditionalisation should occur through Bayes’s theorem (where applicable).
Conditionalisation via Bayes’s theorem
Bayes’s theorem:
P(h|e) = [P(e|h) × P_i(h)] / P_i(e)
where P_i(e) = P(e|h) × P_i(h) + P(e|~h) × P_i(~h)
Application to the case:
.2 = (.9 × .1) / .45
where .45 = .9 × .1 + .4 × .9
Bayes’s theorem was expressed in a paper by Rev. Thomas Bayes that was published posthumously in 1763
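Conditionalisation via Bayes’s theorem can be written as a small function. This is a minimal sketch rather than a definitive implementation: the function name is hypothetical, and it assumes P_i(~h) = 1 − P_i(h), which follows from the axioms.

```python
from fractions import Fraction


def posterior(prior_h, likelihood_h, likelihood_not_h):
    """Return P(h|e) from P_i(h), P(e|h) and P(e|~h) via Bayes's theorem."""
    prior_not_h = 1 - prior_h
    prior_e = likelihood_h * prior_h + likelihood_not_h * prior_not_h
    if prior_e == 0:
        raise ValueError("P_i(e) must be non-zero to conditionalise on e")
    return likelihood_h * prior_h / prior_e


# The smiling case: P_i(h) = .1, P(e|h) = .9, P(e|~h) = .4
p_likes_given_smile = posterior(Fraction(1, 10), Fraction(9, 10), Fraction(2, 5))
print(float(p_likes_given_smile))
```

Exact fractions are used so that the result can be compared with the survey count 9/45 without rounding error.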
Arguments for the conditionalisation component
- Case-by-case
- Bayes’s theorem is used widely in statistics
- Dutch-book
Part II: Applications and problems
Does God exist?
P_i(h) = ? (where h = theism)
(One version of) The principle of indifference:
In the absence of evidence favouring one possibility over another, assign each
possibility an equal probability
The principle of indifference seems intuitively plausible in many cases
E.g. all you know is that a prize is behind one of three doors
Presumably the probability that it is behind a given door is 1/3 or approximately .33
Application to theism:
Either h or ~h, so P_i(h) = .5
Sounds reasonable, right?
Multiple partitions problem
Suppose you’re cooking dinner for Jed, but you don’t know
whether he eats meat
One partition of possibilities: Either 1) Jed is a meat eater (h) or 2) he is
not a meat eater (~h), so P_i(h) = .5
Another partition of possibilities: 1) Jed is a meat eater (h), 2) Jed is a
vegetarian (v1) or 3) Jed is a vegan (v2), so P_i(h) = 1/3
The problem is that the space of possibilities can be partitioned
differently so that it is unclear as to how or whether to apply the
principle of indifference
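The dependence of the prior on the chosen partition can be made concrete. The sketch below assumes the simple version of the principle of indifference stated earlier, on which each cell of a partition receives probability 1/n; the function name and the labels are hypothetical.

```python
from fractions import Fraction


def indifference_prior(partition, hypothesis):
    """Assign 1/n to each of the n cells of a partition and return the cell for h."""
    if len(set(partition)) != len(partition):
        raise ValueError("partition cells must be distinct")
    if hypothesis not in partition:
        raise ValueError("hypothesis must be one of the partition cells")
    return Fraction(1, len(partition))


two_way = ["meat eater", "not a meat eater"]
three_way = ["meat eater", "vegetarian", "vegan"]

print(indifference_prior(two_way, "meat eater"))
print(indifference_prior(three_way, "meat eater"))
```

The hypothesis is the same in both calls; only the way the alternatives are carved up differs, and that alone changes the prior.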
Application to theism
Either h or ~h, so P_i(h) = .5
But what about another partition?
There is no ultimate cause of the universe
Or there is an ultimate cause of the universe, but this cause is not a person (or
conscious being)
Or there is a personal and ultimate cause of the universe, but this cause is not
omnibenevolent
Or there is a personal, omnibenevolent and ultimate cause of the universe, but this
cause is not omnipotent
Or theism is true
So already P_i(h) ≤ 1/5 according to the principle of indifference!
The problem of the priors:
Subjective and objective Bayesianism
We can partition the logical possibilities differently so as to
yield conflicting results when the principle of indifference is applied
So which partition do we go with?
Some think that there is no uniquely correct partition
So how do we determine P_i(h)?
Subjectivists: Well, just pick any value you like – no value is incorrect,
except for perhaps 1 or 0
Objectivists: There is a uniquely correct value for P_i(h), and it is…
Let’s move on and assume that P_i(h) = .5, just for illustration
What evidence is there that God exists?
Theistic evidence:
- Fine-tuning of laws and constants
- A universe
- Moral truths
- Miracle reports
- Abiogenesis (Origins of life)
Atheistic evidence:
- Human suffering
- Animal suffering
- Non-resistant non-belief in God
- Scale of the universe
- Contradictory theistic theories
- Theism is less simple (Occam’s razor)
The fine-tuning argument
e1 = the laws of the universe are finely tuned to permit meaningful life:
According to philosopher Robin Collins, if the strength of the gravitational
force were to change by one part in 10^36, then any land-based or aquatic
organisms the size of humans would be crushed.
P(e1|h) = .5
P(e1|~h) = 1/10^36
Note that I will assume that ~h is equivalent to Western philosophical atheism (rather than also including
polytheism, pantheism, etc.)
What is the posterior probability of theism?
P(h|e1) ≈ 1
The fine-tuning argument – Just
e1 = the laws of the universe are finely tuned to permit meaningful life
P(e1|h) = .5
P(e1|~h) = .01
What is the posterior probability of theism?
P(h|e1) ≈ .98
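Both fine-tuning posteriors can be reproduced from the stated likelihoods and the illustrative prior P_i(h) = .5. The sketch below uses exact fractions because 1/10^36 is too small to show up in ordinary floating-point output; the function and variable names are hypothetical.

```python
from fractions import Fraction


def posterior(prior_h, likelihood_h, likelihood_not_h):
    """P(h|e) via Bayes's theorem, with P_i(~h) taken to be 1 - P_i(h)."""
    prior_e = likelihood_h * prior_h + likelihood_not_h * (1 - prior_h)
    return likelihood_h * prior_h / prior_e


PRIOR = Fraction(1, 2)            # P_i(h) = .5, assumed just for illustration
LIKELIHOOD_THEISM = Fraction(1, 2)  # P(e1|h) = .5

strong = posterior(PRIOR, LIKELIHOOD_THEISM, Fraction(1, 10**36))
modest = posterior(PRIOR, LIKELIHOOD_THEISM, Fraction(1, 100))

print(float(1 - strong))  # how far the first posterior falls short of 1
print(float(modest))
```

Even when P(e1|~h) is raised from 1/10^36 to .01, the posterior stays high, because it is the ratio of the two likelihoods that drives the update.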
The multiverse objection
If there were (infinitely) many universes with the values of their laws randomly
generated by chance, then we wouldn’t be surprised to see that one of them
happens to have life-permitting values
In Bayesian terms:
Perhaps a is true, where a = there is an (infinitely) large number of other universes with
values randomly generated by chance, and P(e1|~h & a) = 1 (or some relatively high figure)
The argument from suffering
e2 = humans suffer and this is a bad thing
Missing buses
Now our prior probability relative to e2 is our posterior probability relative to e1, so
P_i(h) ≈ .98
What are the likelihoods?
Logical argument from evil (J.L. Mackie):
P(e2|h) = 0
P(e2|~h) = .5
So, P(h|e2) = 0
The argument from suffering
Evidential argument from evil (William Rowe):
P(e2|h) = .01
P(e2|~h) = .5
So, P(h|e2) ≈ .5
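Updating on e1 and then on e2 is a chain of conditionalisations in which each posterior becomes the next prior. The following sketch assumes the illustrative starting prior of .5 and the likelihoods from the slides; the function names are hypothetical.

```python
from fractions import Fraction


def posterior(prior_h, likelihood_h, likelihood_not_h):
    """P(h|e) via Bayes's theorem, with P_i(~h) taken to be 1 - P_i(h)."""
    prior_e = likelihood_h * prior_h + likelihood_not_h * (1 - prior_h)
    return likelihood_h * prior_h / prior_e


def update_sequentially(prior_h, likelihood_pairs):
    """Conditionalise on each piece of evidence in turn, reusing the last posterior."""
    credence = prior_h
    for likelihood_h, likelihood_not_h in likelihood_pairs:
        credence = posterior(credence, likelihood_h, likelihood_not_h)
    return credence


FINE_TUNING = (Fraction(1, 2), Fraction(1, 100))  # P(e1|h), P(e1|~h)
EVIDENTIAL = (Fraction(1, 100), Fraction(1, 2))   # Rowe: P(e2|h), P(e2|~h)
LOGICAL = (Fraction(0), Fraction(1, 2))           # Mackie: P(e2|h) = 0

start = Fraction(1, 2)
print(float(update_sequentially(start, [FINE_TUNING, EVIDENTIAL])))
print(float(update_sequentially(start, [FINE_TUNING, LOGICAL])))
```

On these numbers the evidential argument exactly undoes the fine-tuning update, while a likelihood of 0 sends the posterior to 0 regardless of the prior.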
Sceptical theism
“God knows a lot more than us and would have
reasons to justify his actions which we do not know of”
“So if God existed, there was suffering and we did not
see any reason that would justify God’s permission of
the suffering, then we would not be surprised”
More sophisticated defences of versions of sceptical
theism are given by Stephen Wykstra and Daniel
The problem of the priors
There is sometimes a lot of debate about the likelihoods, or at least about what the
relevant likelihoods are
Suppose we agree that:
P(e|h) = .9
P(e|~h) = .1
So if we assume that P_i(h) = .5
Then P(h|e) = .9
But if we assume that P_i(h) = .1
Then P(h|e) = .5
And if we assume that P_i(h) = .00001
Then P(h|e) ≈ .00009
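The sensitivity of the posterior to the prior can be tabulated directly. The sketch below holds the agreed likelihoods fixed and varies only P_i(h); the function name and the dictionary are hypothetical conveniences.

```python
from fractions import Fraction


def posterior(prior_h, likelihood_h, likelihood_not_h):
    """P(h|e) via Bayes's theorem, with P_i(~h) taken to be 1 - P_i(h)."""
    prior_e = likelihood_h * prior_h + likelihood_not_h * (1 - prior_h)
    return likelihood_h * prior_h / prior_e


LIKELIHOOD_H = Fraction(9, 10)      # P(e|h) = .9
LIKELIHOOD_NOT_H = Fraction(1, 10)  # P(e|~h) = .1

priors = [Fraction(1, 2), Fraction(1, 10), Fraction(1, 100_000)]
posteriors = {p: posterior(p, LIKELIHOOD_H, LIKELIHOOD_NOT_H) for p in priors}

for prior, post in posteriors.items():
    print(f"P_i(h) = {float(prior):.5f}  ->  P(h|e) = {float(post):.5f}")
```

With the likelihoods fixed, the evidence multiplies the odds on h by the same factor of 9 each time, so a very small prior still yields a very small posterior.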
The problem of the priors
The posterior probability is sensitive to the value of the prior probability
Subjective Bayesians often think that the subjectivity of the prior is not a major
problem since the subjectivity will be “washed out” as evidence accumulates
So two people starting off with different priors will converge on the probable truth
given their conditioning on a growing body of evidence
However, as Alan Hájek notes:
“Indeed, for any range of evidence, we can find in principle an agent whose prior is
so pathological that conditionalizing on that evidence will not get him or her
anywhere near the truth, or the rest of us.”
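The “washing out” claim and Hájek’s reply can both be illustrated numerically. The sketch below makes strong assumptions that are not in the slides: every piece of evidence has the same likelihoods (.9 and .1), the pieces are independent given h and given ~h, and ten pieces arrive. A prior of exactly 0 stands in as one simple instance of a pathological prior.

```python
def posterior(prior_h, likelihood_h, likelihood_not_h):
    """P(h|e) via Bayes's theorem, with P_i(~h) taken to be 1 - P_i(h)."""
    prior_e = likelihood_h * prior_h + likelihood_not_h * (1 - prior_h)
    return likelihood_h * prior_h / prior_e


def update_repeatedly(prior_h, n_pieces, likelihood_h=0.9, likelihood_not_h=0.1):
    """Conditionalise n_pieces times on evidence with the same assumed likelihoods."""
    credence = prior_h
    for _ in range(n_pieces):
        credence = posterior(credence, likelihood_h, likelihood_not_h)
    return credence


N_PIECES = 10  # assumed amount of accumulated evidence

moderate = update_repeatedly(0.5, N_PIECES)
sceptical = update_repeatedly(0.00001, N_PIECES)
dogmatic = update_repeatedly(0.0, N_PIECES)

print(moderate, sceptical, dogmatic)
```

The two non-extreme priors end up close together, which is the convergence subjective Bayesians appeal to, while the prior of 0 never moves however much evidence comes in.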
And there are other worries
So does the problem of the priors render Bayesianism practically useless?
Does it eliminate scepticism about the reliability of inductive inference?
Thank you!