CMU 2012 Ockham’s Razor, Forster

So Many Theories of Simplicity!
Which One is Right?
Malcolm R. Forster
Department of Philosophy
University of Wisconsin-Madison
June 23, 2012
1
Abstract
• For a long time, many assumed that simplicity is a sign of
truth because the world is simple.
• Popper complicated the story by saying that simplicity is
falsifiability, so a simpler theory can be severely tested (and if
it passes those tests, it is well corroborated).
• Statisticians have pointed out that simpler models have
fewer parameters, in which case each can be more accurately
estimated, which leads to more accurate predictions.
• Now Kevin Kelly has taught us how we more quickly
converge towards the truth if we examine simpler theories
first.
• Which theory of simplicity is right? I want to say that all of
them are mutually compatible, and most of them are right.
2
What I will actually talk about!
What were the original examples in the history of
science that motivated principles of parsimony and
Ockham’s razor?
An early example was Copernicus’s appeal to
harmony as an argument against Ptolemy’s
earth-centered system of planetary astronomy.
How should we view this example, and other real
scientific examples, in light of what we have learned
about simplicity in recent years?
3
Disparate Examples of the Same Thing?
Example 1: Copernicus versus Ptolemy
Example 2: The time asymmetry of cause and effect.
Example 3: Kepler versus Copernicus
Example 4: The time asymmetry of cause and effect.
4
$E = ma^2$
$E = mb^2$
Copernicus (1473 - 1543)
Sun-Centered Theory of
the Planetary Motion
“We thus follow Nature, who producing nothing in vain
or superfluous often prefers to endow one cause with
many effects. Though these views are difficult, contrary
to expectation, and certainly unusual, yet in the sequel we
shall, God willing, make them abundantly clear, at least
to the mathematicians.”
---De Revolutionibus, Book I, Chapter 10.
6
Copernicus versus Ptolemy
[Figure, two panels: “A Copernican submodel for Mars” (labels Sun, Earth, Mars) and “Corresponding Ptolemaic sub-model for Mars” (labels Earth, Mars).]
7
Two-Circle Copernican Model for Mars
[Figure: labels Sun, deferent, epicycle, Mars.]
The number of circles fixes the Copernican model;
radii, rates of motion, etc., are adjustable within the model.
8
Copernicus versus Ptolemy
[Figure, two panels: “Another Copernican sub-model for Mars” (labels Sun, Earth, Mars) and “The corresponding Ptolemaic sub-model for Mars” (labels Earth, Mars).]
9
Copernicus versus Ptolemy
Everyone has heard the expression “adding
epicycles” used in a disparaging way.
A common misconception is that Ptolemy used
epicycles and Copernicus did not, therefore
Copernicus’s theory was simpler than Ptolemy’s.
FALSE!
Note the title of this demonstration!
http://www.youtube.com/watch?v=QVuU2YCwHjw
10
Copernicus versus Ptolemy
The point of the Homer Simpson demonstration is that
both Ptolemy’s and Copernicus’s theories can fit any
data to any degree of precision. (Pure accommodation)
So why did Copernicus supersede Ptolemy?
Maybe Copernicus predicted some things better than
Ptolemy?
Problem: If we focus just on one planet at a time, then
for any Copernican model, there exists a corresponding
Ptolemaic model that is empirically equivalent.
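The "pure accommodation" point can be illustrated with a discrete Fourier transform: each Fourier coefficient is one uniformly rotating circle, and N such circles reproduce N observed positions exactly. This is a minimal sketch under that assumption, not the construction used in the video; the sample points and function names are hypothetical.

```python
import cmath

def epicycles(points):
    """Return one complex coefficient per circle (a plain DFT).

    Circle k has radius abs(c[k]), starting phase cmath.phase(c[k]),
    and turns k times per period.
    """
    n = len(points)
    return [
        sum(p * cmath.exp(-2j * cmath.pi * k * t / n) for t, p in enumerate(points)) / n
        for k in range(n)
    ]

def position(coeffs, t):
    """Add up all the circles at sample time t (a tip-to-tail vector sum)."""
    n = len(coeffs)
    return sum(c * cmath.exp(2j * cmath.pi * k * t / n) for k, c in enumerate(coeffs))

# Hypothetical "observations": an arbitrary, decidedly non-circular path.
observed = [0 + 0j, 2 + 1j, 3 + 4j, 1 + 5j, -2 + 3j, -3 - 1j, -1 - 4j, 2 - 2j]
circles = epicycles(observed)
max_error = max(abs(position(circles, t) - p) for t, p in enumerate(observed))
print(f"{len(circles)} circles fit {len(observed)} points; max error = {max_error:.2e}")
```

Because the number of circles equals the number of data points, the fit is exact whatever the path looks like, which is why a perfect fit of this kind carries no evidential weight by itself.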
11
The ‘anything you can do I can do
just as well’ problem!
12
Common Responses to the Problem
Copernicus’s theory is more explanatory than
Ptolemy’s theory.
Copernicus’s theory is more harmonious, simpler, or
more unified than Ptolemy’s theory.
Copernicus’s theory is more falsifiable than
Ptolemy’s theory.
Copernicus’s theory makes predictions that
Ptolemy’s does not.
13
Copernicus versus Ptolemy
I want to argue:
1. (Popper’s falsifiability criterion.) Copernicus’s
theory was predictively more powerful. (E.g. the
phases of Venus.)
2. At least some of these predictions were known to
be true in Copernicus’s time. (Contra Popper!)
3. The reason Copernicus’s theory was predictively more
powerful was that it was simpler, in some sense.
14
Copernicus versus Ptolemy
In what sense is Copernicus’s theory simpler than
Ptolemy’s theory?
15
Copernicus versus Ptolemy
Take any Copernican model (that is, a version of
Copernicus’s theory with a fixed number of circles for
each celestial body), and then consider the corresponding
Ptolemaic model such that the submodel for each celestial
body individually is empirically equivalent (there is only
one way of doing this).
Each submodel is empirically equivalent because it simply
adds the same circles (or scaled versions of them) in a
different order.
But the Copernican model will have fewer circles overall
because Ptolemy has to "make" copies of the circles for the
earth-sun component of the apparent motion of each planet.
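Both claims on this slide can be checked numerically. The sketch below assumes one uniform circle per body and uses illustrative radii and periods that are not from the slides: the Earth-to-planet vector is identical whether the two circles are added in the Copernican or the Ptolemaic order, yet the Ptolemaic model needs its own copy of the Earth-Sun circle for every planet.

```python
import cmath

def circle(radius, period, t):
    """Position at time t on a uniformly traversed circle centred on the origin."""
    return radius * cmath.exp(2j * cmath.pi * t / period)

# Illustrative (assumed) radii in AU and periods in years, one circle per body.
EARTH = (1.0, 1.0)
PLANETS = {"Mars": (1.52, 1.88), "Jupiter": (5.2, 11.86)}

def copernican_view(planet, t):
    """Planet as seen from Earth: heliocentric planet minus heliocentric Earth."""
    return circle(*PLANETS[planet], t) - circle(*EARTH, t)

def ptolemaic_view(planet, t):
    """Earth at rest: a deferent plus an epicycle, both private to this planet."""
    deferent = circle(*PLANETS[planet], t)   # the planet's own circle, centred on Earth
    epicycle = -circle(*EARTH, t)            # a copy of the Sun's circle about the Earth
    return deferent + epicycle

times = [0.1 * i for i in range(200)]
max_gap = max(
    abs(copernican_view(p, t) - ptolemaic_view(p, t)) for p in PLANETS for t in times
)

copernican_circles = 1 + len(PLANETS)      # the Earth's circle is shared by all planets
ptolemaic_circles = 1 + 2 * len(PLANETS)   # Sun's circle, then deferent + epicycle per planet
print(max_gap, copernican_circles, ptolemaic_circles)
```

Under these assumptions the planet-by-planet predictions coincide, and the only difference is the circle count, which grows by one extra copy for each planet added to the Ptolemaic model.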
16
Copernicus versus Ptolemy
[Figure: Copernican model (3 circles); labels Sun, Mars, Jupiter.]
17
Ptolemy
[Figure: The corresponding Ptolemaic model; labels Earth (three times), Sun, Mars, Jupiter.]
18
Copernicus’s Prediction of a Coincidence
Retrograde motion:
One known fact not
predicted by any Ptolemaic
model: The retrograde
motion of a superior planet
occurs only if the planet is in
opposition to the sun.
[Figure: labels Sun, Earth, Mars.]
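The coincidence can be reproduced in a toy sun-centered model. This is a minimal sketch, assuming circular coplanar orbits with approximate radii and periods that are not taken from the slides: it finds the times at which Mars appears to move backwards as seen from Earth and measures Mars's angular distance from the Sun at those times.

```python
import cmath
import math

# Assumed values: radius (AU), period (years), starting phase (radians).
EARTH = (1.0, 1.0, 0.0)
MARS = (1.52, 1.88, math.pi)   # starts on the far side of the Sun

def helio(body, t):
    """Sun-centered position of a body on a uniform circle."""
    radius, period, phase = body
    return radius * cmath.exp(1j * (2 * math.pi * t / period + phase))

def seen_from_earth(t):
    """Return (vector to Mars, vector to Sun) as seen from Earth."""
    earth = helio(EARTH, t)
    return helio(MARS, t) - earth, -earth

def elongation(t):
    """Angle in degrees between Mars and the Sun on Earth's sky (180 = opposition)."""
    mars, sun = seen_from_earth(t)
    return abs(math.degrees(cmath.phase(mars / sun)))

def is_retrograde(t, dt):
    """True when Mars's apparent longitude decreases over the next time step."""
    now, _ = seen_from_earth(t)
    later, _ = seen_from_earth(t + dt)
    return cmath.phase(later / now) < 0

dt = 1e-3
episodes, current = [], []
for t in (i * dt for i in range(9000)):   # nine simulated years
    if is_retrograde(t, dt):
        current.append(t)
    elif current:
        episodes.append(current)
        current = []

mid_elongations = [elongation(ep[len(ep) // 2]) for ep in episodes]
min_elongation = min(elongation(t) for ep in episodes for t in ep)
print(len(episodes), [round(m) for m in mid_elongations], round(min_elongation))
```

In this sketch every retrograde episode is centered on opposition, and nothing had to be adjusted to make that happen: it follows from the Earth's circle being shared. A Ptolemaic model can reproduce the same pattern, but only by tuning each planet's epicycle to the Sun separately.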
19
Copernicus versus Ptolemy
Recall, the Problem is that Copernicus and Ptolemy
make the same predictions if we consider only one
celestial body at a time.
If we consider two or more celestial bodies at one
time, then Copernicus’s theory does make predictions
that Ptolemy’s theory does not make!
Moreover, some of these predictions were determined
to be true by the data known in Copernicus’s time.
I also think that this was due to the greater unification
of Copernicus’s theory.
20
Question
Could we arrive at Copernicus’s theory in a data-driven
way?
That is, could Ptolemy have noticed that the epicycles
in the submodels of Mars and Jupiter and Saturn
have the same period as the circles for the Sun-Earth
motion, and strengthened his theory by
saying they are the same thing, giving Ptolemy+?
Yes, but Ptolemy < Ptolemy+ < Copernicus.
21
So, what’s wrong with Ptolemy+?
Why is Ptolemy+ < Copernicus?
Because it does not require that the blue circles are
literally the same circles (same radius as well as
period!).
There is something more than instrumentalism at
work in this example.
We have to take seriously Copernicus’s talk of causes,
where the postulated causes represent some aspect
of reality behind the phenomena.
22
Why is Copernicus better than Ptolemy+?
Given any Copernican model, the corresponding
Ptolemy+ model has the same degree of
confirmation with respect to the data known in
Copernicus’s time.
That is, it is empirically equivalent with respect to
the prediction of the angular positions of all
celestial bodies.
But, Ptolemy+ does not put the Sun at the center of
planetary orbits, does not predict the phases of
Venus, and so on.
23
Why is Copernicus better than Ptolemy+?
That is why we must consider fruitfulness as a
criterion for theory choice in addition to
confirmation with respect to current data.
Copernicus’s theory not only makes some novel
predictions that are known to be true, but is also
predictively more powerful in ways that have not
yet been checked.
Perhaps fruitfulness can also be given an
instrumental justification, but it is certainly not
entirely data-driven, which is my main point.
24
Copernicus (1473 - 1543)
Sun-Centered Theory of
the Planetary Motion
“We thus follow Nature, who producing nothing in vain
or superfluous often prefers to endow one cause with
many effects. Though these views are difficult, contrary
to expectation, and certainly unusual, yet in the sequel we
shall, God willing, make them abundantly clear, at least
to the mathematicians.”
---De Revolutionibus, Book I, Chapter 10.
25
William of Ockham, England (~1280 – 1347 AD)
Occam’s Razor, or the Law of Parsimony
“Entities must not be multiplied beyond necessity.”
“What can be explained by the assumption of fewer things
is vainly explained by the assumption of more things.”
“Plurality is not to be posited without necessity.”
26
FORWARD CAUSAL MODEL
Alice      loss   win
1 euro       90    10
2 euros      80    20
Estimated P(win | 1 euro) = 0.1, estimated P(win | 2 euros) = 0.2
Bob        loss   win
1 euro       90    10
2 euros     160    40
Estimated P(win | 1 euro) = 0.1, estimated P(win | 2 euros) = 0.2
Independent measurements agree.
27
BACKWARD CAUSAL MODEL
Alice      loss   win
1 euro       90    10
2 euros      80    20
Estimated P(2 euros | loss) = 0.47, estimated P(2 euros | win) = 0.67
Bob        loss   win
1 euro       90    10
2 euros     160    40
Estimated P(2 euros | loss) = 0.64, estimated P(2 euros | win) = 0.80
Independent measurements do NOT agree.
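The estimates on the forward and backward slides can be recomputed directly from the counts. A minimal sketch; the dictionary layout and function names are illustrative choices, and the numbers are the ones in the Alice and Bob tables.

```python
# counts[player][stake in euros] = (losses, wins), copied from the slides.
counts = {
    "Alice": {1: (90, 10), 2: (80, 20)},
    "Bob": {1: (90, 10), 2: (160, 40)},
}

def forward(table):
    """Estimate P(win | stake) for each stake."""
    return {stake: wins / (losses + wins) for stake, (losses, wins) in table.items()}

def backward(table):
    """Estimate P(2 euros | outcome) for each outcome."""
    loss_1, win_1 = table[1]
    loss_2, win_2 = table[2]
    return {"loss": loss_2 / (loss_1 + loss_2), "win": win_2 / (win_1 + win_2)}

for player, table in counts.items():
    fwd = {stake: round(p, 2) for stake, p in forward(table).items()}
    bwd = {outcome: round(p, 2) for outcome, p in backward(table).items()}
    print(player, "forward:", fwd, "backward:", bwd)
```

The forward estimates are unchanged when Bob plays the 2 euro game twice as often as Alice, whereas the backward estimates absorb each player's choice of stakes, which is why they disagree between players.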
28
The Asymmetry of Regression
• The data are generated
from y = x + u, where x is
N(–10,1), u is N(0,1) and
u is independent of x.
• The y on x regression is
different from the x on y
regression.
[Figure: plot of y against x.]
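The asymmetry is easy to see in simulation. A minimal sketch, assuming a sample size and random seed chosen arbitrarily here: with y = x + u as specified on the slide, the y on x regression has slope cov(x, y)/var(x) = 1, while the x on y regression has slope cov(x, y)/var(y) = 1/2, so the two fitted lines are not inverses of each other.

```python
import random

def ols(inputs, targets):
    """Least-squares (slope, intercept) for predicting targets from inputs."""
    n = len(inputs)
    mean_in, mean_out = sum(inputs) / n, sum(targets) / n
    cov = sum((a - mean_in) * (b - mean_out) for a, b in zip(inputs, targets))
    var = sum((a - mean_in) ** 2 for a in inputs)
    slope = cov / var
    return slope, mean_out - slope * mean_in

rng = random.Random(0)                        # arbitrary fixed seed
x = [rng.gauss(-10, 1) for _ in range(20000)]
y = [xi + rng.gauss(0, 1) for xi in x]        # y = x + u, with u independent of x

slope_yx, intercept_yx = ols(x, y)            # y on x: close to y = x
slope_xy, intercept_xy = ols(y, x)            # x on y: close to x = y/2 - 5
print(f"y on x: slope {slope_yx:.2f}, intercept {intercept_yx:.2f}")
print(f"x on y: slope {slope_xy:.2f}, intercept {intercept_xy:.2f}")
```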
29
A Difference in Direction
• The forwards regression
model predicts the second data
set (top right) better than the
backwards regression model.
[Figure: plot of y against x, both axes running from -10 to 10.]
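The difference in direction can be sketched as follows. The slide does not say how the second data set was generated; this example assumes it follows the same law y = x + u but with x drawn from N(10,1), which would place it at the top right of the plot. Each model is fitted to the first data set only and then scored in its own direction on the second.

```python
import random

def ols(inputs, targets):
    """Least-squares (slope, intercept) for predicting targets from inputs."""
    n = len(inputs)
    mean_in, mean_out = sum(inputs) / n, sum(targets) / n
    cov = sum((a - mean_in) * (b - mean_out) for a, b in zip(inputs, targets))
    var = sum((a - mean_in) ** 2 for a in inputs)
    slope = cov / var
    return slope, mean_out - slope * mean_in

def sample(rng, mean_x, n):
    """Draw n points from y = x + u with x ~ N(mean_x, 1) and u ~ N(0, 1)."""
    xs = [rng.gauss(mean_x, 1) for _ in range(n)]
    ys = [xv + rng.gauss(0, 1) for xv in xs]
    return xs, ys

def mse(model, inputs, targets):
    """Mean squared prediction error of a fitted (slope, intercept) pair."""
    slope, intercept = model
    return sum((slope * a + intercept - b) ** 2 for a, b in zip(inputs, targets)) / len(inputs)

rng = random.Random(1)               # arbitrary fixed seed
x1, y1 = sample(rng, -10, 5000)      # first data set, as specified on the slide
x2, y2 = sample(rng, 10, 5000)       # ASSUMED second data set (not specified on the slide)

forward_model = ols(x1, y1)          # predicts y from x
backward_model = ols(y1, x1)         # predicts x from y

forward_mse = mse(forward_model, x2, y2)
backward_mse = mse(backward_model, y2, x2)
print(f"forward MSE {forward_mse:.1f}, backward MSE {backward_mse:.1f}")
```

Under this assumption the forward model keeps an error close to the noise variance, while the backward model, whose coefficients depend on where the first sample of x happened to sit, misses badly once x moves.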
30
What do these disparate
examples have in common?
How is the Forward Causal Model similar to Copernicus?
The postulated “cause” is not an event. It is represented by a
forward conditional probability.
We do not write:
P_Alice(win | 1 euro), P_Alice(win | 2 euros),
P_Bob(win | 1 euro), P_Bob(win | 2 euros)
(What would we predict if Carol played?)
31
Kepler versus Copernicus
In the Copernicus versus Ptolemy example, the
point has nothing to do with overfitting. (Copernicus
makes novel predictions.)
Kepler replaced Copernicus’s circles on circles with
a single ellipse for each planet.
The Homer Simpson demonstration makes the point
that Copernicus can accommodate anything.
But if there are as many adjustable parameters as
data points, then it can’t be trusted for prediction.
32
Kepler versus Copernicus
There is no possibility that Kepler can accommodate
a Homer Simpson orbit, because there is only a
single ellipse (with a small handful of adjustable
parameters).
Kepler tried 19 different oval-shaped curves, and
decided on the ellipse.
Perhaps this was a lucky choice?
33
Kepler’s Model
[Figure: labels Mars, ellipse, Sun.]
There is only one Keplerian model; initial positions,
eccentricity, size, rate of motion, etc., are adjustable.
34
Kepler’s Third Law
The ratio of the mean radius of a planet’s orbit cubed
to the period squared is the same for all planets.
For Newton, these ratios provided six independent
measurements of the sun’s mass.
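The constancy of the ratio can be checked for the six planets known in Newton's time. A minimal sketch using approximate modern values, which are an assumption of this example and not data from the slides; in Newton's theory the common value equals GM/(4π²) for the sun, so each planet supplies its own estimate of the same quantity.

```python
# Approximate modern values (assumed): semi-major axis in AU, period in years.
PLANETS = {
    "Mercury": (0.387, 0.241),
    "Venus": (0.723, 0.615),
    "Earth": (1.000, 1.000),
    "Mars": (1.524, 1.881),
    "Jupiter": (5.203, 11.862),
    "Saturn": (9.537, 29.457),
}

# Kepler's third law: radius cubed over period squared, one value per planet.
ratios = {name: a ** 3 / period ** 2 for name, (a, period) in PLANETS.items()}

for name, ratio in ratios.items():
    print(f"{name:8s} a^3 / T^2 = {ratio:.3f}")

spread = max(ratios.values()) - min(ratios.values())
print(f"spread across planets: {spread:.4f}")
```

In these units the ratio is 1 for the Earth by definition, so the agreement of the other five values with 1 is what carries the evidential weight.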
35
Newton’s Apple
A story about the
unification of celestial
and terrestrial motion;
of Kepler’s law
applied to the moon,
and Galileo’s theory
of projectile motion.
36
The Concept of Acceleration
Galileo’s thought experiment for
his law of circular inertia.
Galileo thought of uniform circular motions as
inertial, therefore the moon was outside the influence
of the earth’s gravity.
37
Newton’s Innovative
Conception of
Acceleration…
…as a quantity with
direction and magnitude
(a vector).
[Figure: velocity vectors (labelled v) and the Earth.]
38
Newton’s Consilience of Kepler’s and
Galileo’s Inductions
The agreement of independent measurements of
the Earth’s gravitational mass…
…explains celestial and terrestrial motions as
effects of a common cause…
…rather than effects of two separate causes.
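The agreement can be sketched with two back-of-the-envelope estimates of the Earth's gravitational parameter GM, one celestial and one terrestrial. The input values are approximate modern figures and are an assumption of this example, not numbers from the slides; the moon's own mass is ignored.

```python
import math

# Approximate modern values (assumed, not from the slides).
MOON_DISTANCE = 3.844e8         # m, mean Earth-Moon distance
MOON_PERIOD = 27.32 * 86400     # s, sidereal month
EARTH_RADIUS = 6.371e6          # m
SURFACE_GRAVITY = 9.81          # m/s^2, acceleration of a falling apple

# Celestial measurement: Kepler's third law applied to the Moon's orbit.
gm_from_moon = 4 * math.pi ** 2 * MOON_DISTANCE ** 3 / MOON_PERIOD ** 2

# Terrestrial measurement: inverse-square attraction evaluated at the surface.
gm_from_apple = SURFACE_GRAVITY * EARTH_RADIUS ** 2

relative_gap = abs(gm_from_moon - gm_from_apple) / gm_from_apple
print(f"GM from the Moon:  {gm_from_moon:.3e} m^3/s^2")
print(f"GM from the apple: {gm_from_apple:.3e} m^3/s^2")
print(f"relative gap:      {relative_gap:.1%}")
```

Two separate causes would give no reason to expect these numbers to match; a single cause requires it.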
39
Newton’s Rules of
Hypothesizing
RULE I: We are to admit no more causes of natural
things than are both true and sufficient to explain their
appearances.
To this purpose the philosophers say that Nature does
nothing in vain, and more is in vain when less will
serve; for Nature is pleased with simplicity, and
affects not the pomp of superfluous causes.
40
RULE II: Therefore to the same natural effects
we must, as far as possible, assign the same
causes.
EXAMPLE: Newton assigned the same cause to the
motion of Jupiter’s moons and the displacement
of the center of the orbits of the planets outside
Jupiter, namely the gravitational influence of
Jupiter, as measured by its gravitational mass.
41
To sum up…
Some of the arguments from simplicity have
something to do with minimizing sampling error
and avoiding overfitting (e.g. Kepler’s solution to
the Homer Simpson problem).
Others, such as Copernicus’s “argument from
harmony” and Newton’s argument for universal
gravitation have to do with postulating one cause
for many effects.
42
The End