Bayesian Decision Theory

Foundations for a unified theory

What is it?

• Bayesian decision theories are formal models of rational agency, typically comprising a theory of:

– Consistency of belief, desire and preference

– Optimal choice

• Lots of common ground…

– Ontology : Agents; states of the world; actions/options; consequences

– Form : Two-variable quantitative models; centrality of representation theorems

– Content : The principle that rational action maximises expected benefit.

It seems natural therefore to speak of plain Decision Theory. But there are differences too ...

e.g. Savage versus Jeffrey.

– Structure of the set of prospects

– The representation of actions

– SEU versus CEU.

Are they offering rival theories or different expressions of the same theory?

Thesis : The theories of Ramsey, Savage, Jeffrey (and others) are all special cases of a single Bayesian Decision Theory (obtained by restriction of the domain of prospects).

Plan

• Introductory remarks

– Prospects

– Basic Bayesian hypotheses

– Representation theorems

• A short history

– Ramsey’s solution to the measurement problem

– Ramsey versus Savage

– Jeffrey

• Conditionals

– Lewis-Stalnaker semantics

– The Ramsey-Adams Hypothesis

– A common logic

• Conditional algebras

• A Unified Theory (2nd lecture)

Types of prospects

• Usual factual possibilities e.g. it will rain tomorrow; UK inflation is 3%; etc.

– Denoted by P, Q, etc.

– Assumed to be closed under Boolean compounding

• Conjunction: PQ

• Negation: ¬P

• Disjunction: P v Q

• Logical truth/falsehood: ⊤, ⊥

• Plus derived conditional possibilities e.g. If it rains tomorrow our trip will be cancelled; if the war in Iraq continues, inflation will rise.

– The prospect of X if P and Y if Q will be represented as (P→X)(Q→Y)

Main Claims

• Probability Hypothesis : Rational degrees of belief in factual possibilities are probabilities.

• SEU Hypothesis : The desirability of (P→X)(¬P→Y) is an average of the desirabilities of PX and ¬PY, respectively weighted by the probability that P or that ¬P.

• CEU Hypothesis : The desirability of the prospect of X is an average of the desirabilities of XY and X¬Y, respectively weighted by the conditional probability, given X, of XY and of X¬Y.

• Adams Thesis : The rational degree of belief to have in P→X is the conditional probability of X given that P.
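The hypotheses above can be checked on a small finite model. The sketch below is illustrative only: the four worlds, their probabilities and their utilities are assumed numbers, and the helper names (`prob`, `cond_prob`, `desirability`, `seu`) are hypothetical. Desirability is taken, as an assumption, to be the probability-weighted mean utility of the worlds where a proposition holds.

```python
from fractions import Fraction as F

# Hypothetical toy model: each world settles X and Y and carries an
# assumed probability and an assumed utility (illustrative numbers only).
WORLDS = [
    # (X,    Y,     probability, utility)
    (True,  True,  F(3, 8), F(8)),
    (True,  False, F(1, 8), F(2)),
    (False, True,  F(1, 4), F(5)),
    (False, False, F(1, 4), F(1)),
]

def prob(event):
    """Probability of a proposition, given as a predicate on (x, y)."""
    return sum(p for x, y, p, _ in WORLDS if event(x, y))

def cond_prob(event, given):
    """Conditional probability Pr(event | given)."""
    return prob(lambda x, y: event(x, y) and given(x, y)) / prob(given)

def desirability(event):
    """Probability-weighted mean utility of the worlds where the event holds."""
    return sum(p * u for x, y, p, u in WORLDS if event(x, y)) / prob(event)

def seu(pr_p, v_px, v_notp_y):
    """SEU Hypothesis: value of (P→X)(¬P→Y) from Pr(P), V(PX) and V(¬PY)."""
    return pr_p * v_px + (1 - pr_p) * v_notp_y

X = lambda x, y: x
Y = lambda x, y: y
not_Y = lambda x, y: not y
XY = lambda x, y: x and y
X_not_Y = lambda x, y: x and not y

# CEU Hypothesis: V(X) = V(XY)·Pr(Y|X) + V(X¬Y)·Pr(¬Y|X)
ceu_lhs = desirability(X)
ceu_rhs = (desirability(XY) * cond_prob(Y, X)
           + desirability(X_not_Y) * cond_prob(not_Y, X))

# Adams Thesis: the credence in X→Y is Pr(Y | X)
adams_credence = cond_prob(Y, X)
```

In this model the two sides of the CEU identity agree, and `seu` takes the probability of P and the two desirabilities as inputs, which is all the SEU Hypothesis requires.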

Representation Theorems

• Two problems; one kind of solution!

– Problem of measurement

– Problem of justification

• Scientific application : Representation theorems show that specific conditions on (revealed) preferences suffice to determine a measure of belief and desire.

• Normative application : Theorems show that commitment to conditions on (rational) preference implies commitment to properties of rational belief and desire.

Ramsey-Savage Framework

1. Worlds / consequences: ω₁, ω₂, ω₃, …

2. Propositions / events: P, Q, R, …

3. Conditional Prospects / Actions: (P→ω₁)(Q→ω₂), …

               | Good egg       | Rotten egg
Break egg      | 6-egg omelette | Nothing to eat
Throw egg away | 5-egg omelette | 5-egg omelette

4. Preferences are over worlds and conditional prospects.

“If we had the power of the almighty … we could by offering him options discover how he placed them in order of merit …”
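A minimal sketch of the framework, treating each action as a mapping from events to consequences as in the omelette table. The utilities and the probability of a good egg are assumptions chosen for illustration, and `expected_utility` and `preferred_act` are hypothetical helper names.

```python
from fractions import Fraction as F

STATES = ("Good egg", "Rotten egg")

# Actions as conditional prospects: one consequence per state.
ACTS = {
    "Break egg":      {"Good egg": "6-egg omelette", "Rotten egg": "Nothing to eat"},
    "Throw egg away": {"Good egg": "5-egg omelette", "Rotten egg": "5-egg omelette"},
}

# Assumed numbers, not taken from the slides.
UTILITY = {"6-egg omelette": F(6), "5-egg omelette": F(5), "Nothing to eat": F(0)}
PROB = {"Good egg": F(9, 10), "Rotten egg": F(1, 10)}

def expected_utility(act):
    """Probability-weighted average of the utilities of the act's consequences."""
    return sum(PROB[s] * UTILITY[ACTS[act][s]] for s in STATES)

def preferred_act():
    """The act with the highest expected utility under the assumed numbers."""
    return max(ACTS, key=expected_utility)
```

With these assumed numbers breaking the egg comes out ahead; lowering the probability of a good egg far enough reverses the ranking.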

Ramsey’s Solution to the Measurement Problem

1. Ethically neutral propositions

• Problem of definition

• Enp P has probability one-half iff, for all ω₁ and ω₂, (P→ω₁)(¬P→ω₂) ≈ (¬P→ω₁)(P→ω₂)

2. Differences in value

• Values are sets of equi-preferred prospects

• α – β = γ – δ iff (P→α)(¬P→δ) ≈ (P→β)(¬P→γ)

3. Existence of utility

Axiomatic characterisation of a value difference structure implies the existence of a mapping U from values to real numbers such that:

α – β = γ – δ iff U(α) – U(β) = U(γ) – U(δ)

4. Derivation of probability

Suppose δ ≈ (α if P)(β if ¬P). Then:

Pr(P) = (U(δ) – U(β)) / (U(α) – U(β))
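The derivation of probability can be tried out numerically. The sketch below assumes the utilities have already been measured; `ramsey_probability` is a hypothetical helper name and the example values are illustrative.

```python
from fractions import Fraction as F

def ramsey_probability(u_alpha, u_beta, u_delta):
    """Pr(P) implied by indifference between δ and the gamble (α if P)(β if ¬P).

    Solves U(δ) = Pr(P)·U(α) + (1 – Pr(P))·U(β) for Pr(P).
    """
    if u_alpha == u_beta:
        # The gamble is then worth U(α) whatever Pr(P) is, so nothing is revealed.
        raise ValueError("U(α) and U(β) must differ")
    return F(u_delta - u_beta) / F(u_alpha - u_beta)

# Illustrative utilities (assumed): U(α) = 10, U(β) = 0, U(δ) = 7.
pr_p = ramsey_probability(10, 0, 7)

# Sanity check: the gamble's expected utility equals U(δ).
gamble_value = pr_p * 10 + (1 - pr_p) * 0
```

The guard on U(α) = U(β) reflects the fact that a gamble between equally valued outcomes carries no information about the agent's degree of belief in P.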

Evaluation

• The Justification problem

– Why should measurement axioms hold?

– Sure-Thing Principle versus P4 and Impartiality

• Jeffrey’s objection

– Fanciful causal hypotheses and artifacts of attribution.

– Behaviourism in decision theory

• Ethical neutrality versus state dependence

– Desirabilistic dependence

– Constant acts

Utility Dependence

               | Good egg                      | Rotten egg
Break egg      | 6-egg omelette, none wasted   | Nothing to eat, 5 eggs wasted
Throw egg away | 5-egg omelette, 1 egg wasted  | 5-egg omelette, none wasted
Miracle        | 6-egg omelette, none wasted   | 6-egg omelette, none wasted
Topsy Turvy    | Nothing to eat, 5 eggs wasted | 6-egg omelette, none wasted

Probability Dependence

                | Republican                           | Democrat
Dodgy land deal | Low taxes, unrestricted development  | High taxes, restricted development
No deal         | No development                       | No development
Miracle deal    | High taxes, restricted development   | Low taxes, unrestricted development

Jeffrey

• Advantages

– A simple ontology of propositions

– State dependent utility

– Partition independence (CEU)

• Measurement

– Under-determination of quantitative representations

– The inseparability of belief and desire?

– Solutions: More axioms, more relations or more prospects?

• The logical status of conditionals

Conditionals

• Two types of conditional?

– Counterfactual : If Oswald hadn’t killed Kennedy then someone else would have.

– Indicative : If Oswald didn’t kill Kennedy then someone else did.

• Two types of supposition

– Evidential : If it’s true that …

– Interventional : If I make it true that …

[Lewis, Joyce, Pearl versus Stalnaker, Adams, Edgington]

Lewis-Stalnaker semantics

Intuitive idea : A □→ B is true iff B is true in those worlds most like the actual one in which A is true.

Formally : A □→ B is true at a world w iff for every A¬B-world there is a closer AB-world (relative to an ordering on worlds).

1. Limit assumption : There is a closest world

2. Uniqueness Assumption : There is at most one closest world.
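A minimal sketch of the semantics on a finite set of worlds, using the Oswald example. The three worlds and their distances from the actual world are assumptions made for illustration, and a numeric distance stands in for the ordering on worlds. `would` implements the formal clause directly; `would_by_closest` is the "closest worlds" reading, which gives the same verdicts on a finite set, where a closest A-world always exists.

```python
# Hypothetical worlds; "dist" is an assumed distance from the actual world.
WORLDS = [
    {"name": "actual",        "oswald": True,  "killed": True,  "dist": 0},
    {"name": "nobody shoots", "oswald": False, "killed": False, "dist": 1},
    {"name": "second gunman", "oswald": False, "killed": True,  "dist": 2},
]

def would(a, b, worlds):
    """Formal clause: every A¬B-world has a strictly closer AB-world."""
    return all(
        any(a(v) and b(v) and v["dist"] < w["dist"] for v in worlds)
        for w in worlds
        if a(w) and not b(w)
    )

def would_by_closest(a, b, worlds):
    """Closest-worlds reading: B holds at every closest A-world."""
    a_worlds = [w for w in worlds if a(w)]
    if not a_worlds:
        return True  # vacuously true when A holds nowhere
    nearest = min(w["dist"] for w in a_worlds)
    return all(b(w) for w in a_worlds if w["dist"] == nearest)

not_oswald = lambda w: not w["oswald"]
killed = lambda w: w["killed"]
not_killed = lambda w: not w["killed"]

# "If Oswald hadn't killed Kennedy then someone else would have."
counterfactual = would(not_oswald, killed, WORLDS)
```

Under the assumed distances the counterfactual comes out false, because the closest world in which Oswald does not shoot is one in which Kennedy survives.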

The Ramsey-Adams Hypothesis

• General Idea : Rational belief in conditionals goes by conditional belief for their consequents on the assumption that their antecedent is true.

• Adams Thesis : The probability of an (indicative) conditional is the conditional probability of its consequent given its antecedent:

(AT) p(A→B) = p(B | A)

• Logic from belief : A sentence Y can be validly inferred from a set of premises iff the high probability of the premises guarantees the high probability of Y .
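To see what (AT) demands, it helps to compare p(B | A) with the probability of the material conditional A⊃B, which equals 1 – p(A¬B) and is never lower than p(B | A) when p(A) > 0. The credences below are assumed for illustration, and the function names are hypothetical.

```python
from fractions import Fraction as F

# Assumed credences over the four truth-value combinations of (A, B).
CREDENCE = {
    (True, True):   F(1, 10),
    (True, False):  F(1, 10),
    (False, True):  F(4, 10),
    (False, False): F(4, 10),
}

def p(event):
    """Probability of a proposition given as a predicate on (a, b)."""
    return sum(c for (a, b), c in CREDENCE.items() if event(a, b))

def adams_probability():
    """p(A→B) under Adams' Thesis: the conditional probability p(B | A)."""
    return p(lambda a, b: a and b) / p(lambda a, b: a)

def material_probability():
    """Probability of the material conditional A⊃B, i.e. of ¬A v B."""
    return p(lambda a, b: (not a) or b)
```

With these numbers the material conditional is highly probable only because A is improbable, while the conditional probability of B given A is one-half.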

A Common Logic

1. AB ⊨ A→B ⊨ A⊃B

2. ⊤→A ≡ A

3. A→A ≡ ⊤

4. A→¬A ≡ ⊥

5. A→B ≡ A→AB

6. (A→B)(A→C) ≡ A→BC

7. (A→B) v (A→C) ≡ A→(B v C)

8. ¬(A→B) ≡ A→¬B

The Bombshell

• Question : What must the truth-conditions of A→B be, in order that the Ramsey-Adams hypothesis be satisfied?

• Answer : The question cannot be answered.

Lewis, Edgington, Hajek, Gärdenfors, Döring, …: There is no nontrivial assignment of truth-conditions to the conditional consistent with the Ramsey-Adams hypothesis.

• Conclusion :

1. “few philosophical theses that have been more decisively refuted” – Joyce (1999, p.191)

2. Ditch bivalence!

Boolean algebra

[Figure: diagram of a Boolean algebra over A, B and C]

Conditional Algebras (1)

[Figure: diagram of a conditional algebra over A, B and C]

(X→Y)(X→Z) = X→YZ

(X→Y) v (X→Z) = X→(Y v Z)

Conditional algebras (2)

[Figure: diagram of a conditional algebra over A, B and C]

XY ≤ X→Y

Conditional algebras (3)

[Figure: diagram of a conditional algebra over A, B and C]

X→Y ≤ X⊃Y

Normally bounded algebras (1)

[Figure: diagram of a normally bounded algebra over A, B and C]

X→X = ⊤

X→Y = X→XY

Material Conditional

[Figure: diagram of the algebra for the material conditional over A, B and C]

X→⊥ = ¬X

Normally bounded algebras (2)

[Figure: diagram of a normally bounded algebra over A, B and C]

X→¬X = ⊥

¬(X→Y) = X→¬Y

Conditional algebras (3)

[Figure: diagram of a conditional algebra over A, B and C]
