10 choosing a model equation.pptx

10. Choosing the f()
$E\{\mu\} = f(X_1, \ldots, X_n, \beta_1, \ldots, \beta_m)$
CH1. What is what
CH2. A simple SPF
CH3. EDA
CH4. Curve fitting
CH5. A first SPF
CH6: Which fit is fitter
CH7: Choosing the objective function
CH8: Theoretical stuff (skip)
Ch9: Adding variables.
CH10: Estimate accuracy (missing)
CH10. Choosing a model equation
Now that we have discussed
the choice of objective
function and the addition of
variables, we focus on
the function in which
the variables combine.
In this Session (mainly):
1. What functions to try?
2. Going for a better fit.
3. What functions look like.
4. Adding ‘Terrain’ as variable.
5. The modeling process.
$E\{\mu\} = f(X_1, \ldots, X_n, \beta_1, \ldots, \beta_m)$
$X_1, \ldots, X_n$: variables
$\beta_1, \ldots, \beta_m$: parameters
Determining what function hides
behind the noisy data is key to getting
good estimates of E{μ} and σ{μ}.
SPF workshop February 2014
Why “key to...”?
Aristotle (384-322 B.C.): “Acceleration of a free-falling body ∝ its weight.”
One could use data to estimate the
proportionality constant.
This f() would give poor predictions
and be of no practical use.
Galileo (1564-1642)
With Aristotle’s f(), space travel would not work.
“Aristotle maintained that
women have fewer teeth than
men; although he was twice
married, it never occurred to
him to verify this statement
by examining his wives'
mouths.”
Bertrand Russell
Functions are primary, parameters secondary
So far we used $L^{\beta_1}$ to represent the contribution of L.
In the future we will try $L + \beta_1 L^2$ and other functions.
The two β1’s make different contributions to E{μ}.
Moral: Parameters get their meaning from the
function in which they feature.
If so, why so little
attention to getting
the function right?
Why so little attention to f()?
Generic: $E\{\mu\} = f(X_1, \ldots, X_n, \beta_1, \ldots, \beta_m)$
Many modellers state, without giving
reasons, that $e^{\sum_i \beta_i X_i}$ is the ‘f’, and
proceed to parameter estimation.
Why?
The path:
From $E\{\mu\} = f(X_1, \ldots, X_n, \beta_1, \ldots, \beta_m)$
To $E\{\mu\} = e^{\sum_i \beta_i X_i}$
In 4 questionable steps
As extracted from the
Handbook of Econometrics
From $E\{\mu\} = f(X_1, \ldots, X_n, \beta_1, \ldots, \beta_m)$
1. For historical reasons and convenience in estimation (only linear in parameters):
   $\sum_i f_i(X_1, X_2, \ldots)\,\beta_i$
2. It is desirable to identify effects separately:
   $\sum_i f_i(X_i)\,\beta_i$
3. Same f “for ease of computation and interpretation and for aesthetic reasons”:
   $\sum_i f(X_i)\,\beta_i$
4. Assume a specific f:
   $f(X_i) = X_i$
To $E\{\mu\} = e^{\sum_i \beta_i X_i}$
“The computer models
of economists have to use
equations that represent human
behaviour; by common consent,
they do it amazingly badly”
Jon Turney, A model world, December 16, 2013
In SPFs
From $E\{\mu\} = f(X_1, \ldots, X_n, \beta_1, \ldots, \beta_m)$
To $E\{\mu\} = \beta_0 \times f_1(X_1, \boldsymbol{\beta}_1) \times f_2(X_2, \boldsymbol{\beta}_2) \times \ldots$
Multiplicative and (usually) single-variable factors
But:
- Not all functions the same
- Not linear in parameters
- Functions not preselected
(Not $E\{\mu\} = e^{\sum_i \beta_i X_i}$)
Finding the right f() is difficult
I will give you data about Y, X1, and X2
You find f in Y=f(X1, X2)
Measurements:
Object   X1 [m]   X2 [m]   Y [m]
1          2.06     7.91    8.18
2          7.64     5.51   10.10
…
99         4.37     3.60    5.66
100       10.03     4.08   10.77
EDA
It looks like Y=β0+β1X1+ β2X2 should do
The parameter estimates
were 0.35, 0.73 and 0.66
Good correspondence!
The elusive f()
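[Figure: reality and the data on Y, X1, X2]
The researcher chose $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2$
But should have chosen $Y = (X_1^{\beta_1} + X_2^{\beta_2})^{\beta_3}$
Moral 1: The β1 of the first equation (0.73) has nothing to do with the β1 of the second (1.98)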
Moral 2: f() is nearly unfathomable.
Few would have chosen Pythagorean model
equation if the theory was not known.
My students didn’t.
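The experiment can be replayed on simulated data. The sketch below is not the workshop data: it assumes X1 and X2 are uniform between 1 and 11 m, Gaussian noise with a 0.5 m standard deviation, and an arbitrary seed. It generates Y from the Pythagorean form and then fits the linear form by ordinary least squares, so the estimates will differ from those on the slide.

```python
import math
import random

def solve3(a, b):
    """Solve a 3x3 system a x = b by Gauss-Jordan elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_linear(x1, x2, y):
    """Ordinary least squares for Y = b0 + b1*X1 + b2*X2 via the normal equations."""
    cols = [[1.0] * len(y), x1, x2]
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    return solve3(xtx, xty)

random.seed(0)  # arbitrary seed, for repeatability only
n = 100
x1 = [random.uniform(1, 11) for _ in range(n)]  # assumed range, in metres
x2 = [random.uniform(1, 11) for _ in range(n)]
# "Reality": Pythagoras plus measurement noise (noise level is an assumption)
y = [math.hypot(a, b) + random.gauss(0, 0.5) for a, b in zip(x1, x2)]

b0, b1, b2 = fit_linear(x1, x2, y)
fitted = [b0 + b1 * a + b2 * b for a, b in zip(x1, x2)]
y_mean = sum(y) / n
ss_res = sum((o - f) ** 2 for o, f in zip(y, fitted))
ss_tot = sum((o - y_mean) ** 2 for o in y)
r_squared = 1 - ss_res / ss_tot
print(f"b0={b0:.2f}, b1={b1:.2f}, b2={b2:.2f}, R^2={r_squared:.3f}")
```

The linear form fits these data closely even though it is not the function that generated them. A good fit alone does not identify f().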
Moral 3: Is Occam’s razor good advice?
Moral 4: However reasonable the choice of
Y=β0+β1X1+ β2X2 ,
it is wrong to say that when X1 is increased by 1 m, Y
will increase by β1 = 0.73 m.
If X1<<X2 then increase in Y is close to 0;
if X2<<X1 then increase in Y is close to 1.
The regression parameter can tell the result of a
manipulation only when f() is right
Implication for SPFs & CMFs
If we do not know what the true f() is, regressions may
predict well what E{μ} is, but cannot be trusted to
predict how E{μ} will change if a variable is changed.
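Moral 4 can be checked by direct arithmetic when the true f() is Pythagorean. In this small sketch the three evaluation points are chosen for illustration only.

```python
import math

def y_true(x1, x2):
    """The 'reality' of the example: Y is the hypotenuse of X1 and X2."""
    return math.hypot(x1, x2)

def increase_in_y(x1, x2, step=1.0):
    """Change in Y when X1 is raised by `step` metres and X2 is held fixed."""
    return y_true(x1 + step, x2) - y_true(x1, x2)

small_x1 = increase_in_y(0.1, 10.0)   # X1 << X2
equal = increase_in_y(5.0, 5.0)       # X1 == X2
large_x1 = increase_in_y(10.0, 0.1)   # X2 << X1
print(f"{small_x1:.3f} {equal:.3f} {large_x1:.3f}")
```

The effect of the same 1 m manipulation runs from almost 0 to almost 1, depending on where one starts. No single regression coefficient can describe it.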
Moral 5: The models we use are either additive or
multiplicative and made up of single-variable building
blocks. But $Y = (X_1^{\beta_1} + X_2^{\beta_2})^{\beta_3}$ is neither.
So, even simple phenomena may not be represented
by commonly used model forms.
In sum, f() is elusive.
If the aim is to get the CMF, f() must be right.
If the aim is to get good estimates of E{μ}, f() does
not have to be right.
Searching for suitable f()’s
So far we used $E\{\mu\} = \beta_0 [1 - \beta_{\text{slope}}(\text{Year} - 1986)]\, L^{\beta_1}\, \text{AADT}^{\beta_2}$
The β’s were estimated as if the function were the right one.
Is it?
Very, very unlikely.
There is no theory behind this f()
The only guides are:
a) Parsimony of parameters
b) Quality of fit
c) EDA
If f() is not right, how may the parameters be used?
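To make the question concrete, the model equation used so far can be written as a small function. This is a sketch only: the function name is hypothetical, and the parameter values are placeholders chosen to exercise it, not estimates from the workshop data.

```python
def expected_accidents(length, aadt, year, beta0, beta_slope, beta1, beta2):
    """E{mu} = beta0 * [1 - beta_slope*(year - 1986)] * L**beta1 * AADT**beta2."""
    time_factor = 1.0 - beta_slope * (year - 1986)
    return beta0 * time_factor * length ** beta1 * aadt ** beta2

# Placeholder parameter values (assumptions, not fitted estimates)
params = dict(beta0=0.001, beta_slope=0.02, beta1=0.9, beta2=0.8)
base = expected_accidents(length=1.0, aadt=5000, year=1990, **params)
doubled_length = expected_accidents(length=2.0, aadt=5000, year=1990, **params)

# Under this f(), doubling L multiplies E{mu} by 2**beta1 whatever AADT and Year are
print(doubled_length / base, 2 ** params["beta1"])
```

The constant multiplier is a property of the chosen form. If f() is not right, it is not a finding about roads.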
Searching for suitable f()’s
The tools for finding the right f()
are not well developed.
As was shown, without a theory even a
simple f() is difficult to find.
The Modellers
Tantalus's punishment was
‘temptation without satisfaction’
Other functions to be tried, e.g.:
Power: $X^{\beta}$
Polynomial: $X + \beta X^2$
Hoerl: $X^{\beta_1} e^{\beta_2 X}$
...
Mixtures
Which will fit better?
To choose well one has to know what functions look like
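For those without the spreadsheet at hand, the three named forms can be tabulated in a few lines. This is a sketch: the parameter values are illustrative assumptions, picked so that each shape shows a distinct behaviour on X between 0.5 and 2.

```python
import math

def power(x, b):
    """Power: X**b."""
    return x ** b

def polynomial(x, b):
    """Polynomial as used in this session: X + b*X**2."""
    return x + b * x ** 2

def hoerl(x, b1, b2):
    """Hoerl: X**b1 * exp(b2*X), a power function times an exponential."""
    return x ** b1 * math.exp(b2 * x)

xs = [0.5, 1.0, 1.5, 2.0]
# Parameter values are illustrative only
table = {
    "power, b=0.5": [power(x, 0.5) for x in xs],
    "polynomial, b=-0.3": [polynomial(x, -0.3) for x in xs],
    "hoerl, b1=1, b2=-1": [hoerl(x, 1.0, -1.0) for x in xs],
}
for name, values in table.items():
    print(f"{name:20s}", " ".join(f"{v:6.3f}" for v in values))
```

With these values the power function rises at a decreasing rate, the polynomial turns over between X = 1.5 and X = 2, and the Hoerl function peaks at X = 1.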
What do functions (equations) look like?
A visualization tool.
Open #14: ‘Visualise functions.xlsx’ on ‘The Tool’ workpage
Three panels: Basic, Modifier, and Composite
The ordinates
The parameters
The argument (abscissa)
This is what the ‘basic’ functions look like
[Figure: three panels, 1. Power, 2. Polynomial, 3. Logistic, each plotted against X from 0 to 2]
Can you make the ‘power’ function bend down?
Can you give the ‘polynomial’ a maximum?
Lower the ‘logistic’ at x=0.5 while keeping its value at x=1
This is what the ‘modifier’ functions look like
The ‘Composite’ functions
Modifier
Hoerl=Power*Exponential
Logistic*Linear
What composite functions look like
Power & Exponential (Hoerl)
[Figure: 1. Power × a. Exponential = Hoerl, each panel plotted against X from 0 to 2]
Can you make Hoerl lose its peak?
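A hint for this exercise, sketched in code. The derivative of X^β1 e^(β2 X) is X^(β1−1) e^(β2 X) (β1 + β2 X), so a stationary point for X > 0 exists only at X = −β1/β2. The function names and the parameter values below are illustrative assumptions.

```python
import math

def hoerl(x, b1, b2):
    """Hoerl: X**b1 * exp(b2*X)."""
    return x ** b1 * math.exp(b2 * x)

def hoerl_peak(b1, b2):
    """Location of the interior maximum of the Hoerl function for X > 0, or None.

    The derivative X**(b1-1) * exp(b2*X) * (b1 + b2*X) is zero at X = -b1/b2,
    which is a maximum only when b1 > 0 and b2 < 0.
    """
    if b1 > 0 and b2 < 0:
        return -b1 / b2
    return None

peak = hoerl_peak(1.0, -1.0)     # opposite signs: a peak exists
no_peak = hoerl_peak(1.0, 0.5)   # both positive: the function keeps rising
print(peak, no_peak)
```

So the peak disappears as soon as β2 is no longer negative while β1 is positive.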
Comparing fits – a hitch
One can usually improve the fit by using
functions with more parameters. In the limit ....
Same number of parameters;
Fits can be compared.
Not the same number of parameters;
How then to compare fits?
The danger of ‘overfitting’.
What to do?
1. SSD must be larger than the sum of fitted values.
2. By AIC (Akaike Information Criterion): Add
parameter if it increases Ln(maximized likelihood)
by more than 1, i.e. the likelihood itself by a factor of more than e ≈ 2.7.
3. By BIC (Bayesian Information Criterion): Add
parameter if Ln(maximized likelihood) is increased
by more than Ln(Number of data points)/2
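The two rules can be sketched as follows, using the textbook definitions AIC = 2k − 2 ln L and BIC = k ln(n) − 2 ln L, where k is the number of parameters and n the number of data points. Under these definitions one extra parameter pays for itself by AIC when ln L rises by more than 1 (the likelihood itself by a factor of more than e ≈ 2.72), and by BIC when ln L rises by more than ln(n)/2. The function names and the example numbers are hypothetical.

```python
import math

def justified_by_aic(loglik_gain, extra_params=1):
    """True if AIC = 2k - 2 ln L is lower for the larger model."""
    return loglik_gain > extra_params

def justified_by_bic(loglik_gain, n_points, extra_params=1):
    """True if BIC = k ln(n) - 2 ln L is lower for the larger model."""
    return loglik_gain > extra_params * math.log(n_points) / 2

# Hypothetical case: one added parameter, 1000 data points
for gain in (0.5, 2.0, 5.0):
    print(gain, justified_by_aic(gain), justified_by_bic(gain, n_points=1000))
```

BIC is the stricter of the two whenever ln(n)/2 exceeds 1, that is, for more than about 7 data points.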
Trying for a better fit
We used $E\{\mu\} = \beta_0 [1 - \beta_{\text{slope}}(\text{Year} - 1986)]\, L^{\beta_1}\, \text{AADT}^{\beta_2}$
Would it be better if $L^{\beta_1}$ were replaced by $L + \beta_1 L^2$, or if
$\text{AADT}^{\beta_2} e^{\beta_3 \text{AADT}}$ were used instead of $\text{AADT}^{\beta_2}$?
Open #15: ‘Base for fit improvements’ on ‘Power’ workpage
This is where we left off
Now, still on #15 go to ‘Polynomial’ workpage
Replace the power term for L by (B8+$CB$2*B8^2) and copy down.
That’s all. Now use ‘SOLVER’.
Model equation    Log-likelihood
Power             -26329.0
Polynomial        -26311.5
Increase               17.5
Increase in likelihood with no addition of parameters is good.
Caution: Polynomials are risky
Even though the likelihood increased by a factor
of $e^{17.5}$, the CURE plot did not change much.
In praise of the ‘Solver’
Changing the functional form was straightforward.
We replaced $L^{\beta_1}$ by $L + \beta_1 L^2$.
Note: mixes addition and multiplication
but ‘Solver’ did not choke!
Modellers tend to use only linearized
expressions in which addition and
multiplication do not mix. Why?
Recall:
Residual ≡ Observed - Fitted
From the origin to A the fitted values are too small;
the only way to increase them is to allow a positive intercept.
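A CURE plot accumulates these residuals after the sites are sorted by the variable of interest. A minimal sketch follows, with made-up numbers for five segments (not workshop data) and a hypothetical function name.

```python
def cumulative_residuals(variable, observed, fitted):
    """Sort sites by `variable` and accumulate residuals (observed - fitted)."""
    rows = sorted(zip(variable, observed, fitted))
    total = 0.0
    points = []
    for value, obs, fit in rows:
        total += obs - fit
        points.append((value, total))
    return points

# Made-up numbers for five segments, not workshop data
aadt = [4000, 1000, 3000, 2000, 5000]
observed = [5, 2, 3, 3, 4]
fitted = [4.5, 1.0, 3.5, 2.5, 5.5]
for value, cure in cumulative_residuals(aadt, observed, fitted):
    print(value, cure)
```

A curve that climbs from the origin means that observed counts exceed fitted values there, which is the situation described above.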
Add Intercept
Still on #15 go to ‘Add intercept’ workpage
Increase in log-likelihood=?
Justified by AIC, BIC?
Practically important?
We are still “Trying for a better fit”
For AADT I replaced ‘Power’ by the sigmoids:
‘Logistic’, ‘Weibull’ and ‘Hoerl’
Similar log-likelihoods but differing predictions when AADT > 10,000
Footloose!
Which of the many alternative functions to choose?
(The uncertainty due to this source is never considered)
Reporting
$E\{\mu\} = 0.007 + 0.173\,[1 - 0.020\,(\text{Year} - 1986)]\,(L + 0.066 L^2)\,(\text{AADT}/1000)^{0.939}$
$V\{\mu\} = E\{\mu\}^2 / (2.126 \times \text{Segment Length})$
Moral: Choice of function matters.
The estimate of E{μ} may depend strongly on
what function the modeler chooses.
Therefore, uncertainty of prediction is not only
(mainly?) a matter of statistical inaccuracies.
CURE plots still bad.
What to do?
Adding the ‘Terrain’ variable
For each segment we have data about ‘Terrain’ (F, R or M).
Terrain is a proxy for ‘grade’, ‘curvature’ etc. for which we
do not have data. Is ‘Terrain’ safety-relevant?
VIEDA (Pivot)
Terrain        Observed accidents   Fitted values   Observed/Fitted
Flat                         1882          3480.2              0.54
Mountainous                 11273          8609.6              1.31
Rolling                      8563          9628.2              0.89
Grand Total                 21718         21718.0              1.00
Illustrates bias-in-use if ‘Terrain’ is not in model
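The Observed/Fitted column is simple arithmetic on the pivot-table totals. The sketch below recomputes it from the numbers on this slide.

```python
# Observed accidents and fitted values by terrain, as given on this slide
observed = {"Flat": 1882, "Mountainous": 11273, "Rolling": 8563}
fitted = {"Flat": 3480.2, "Mountainous": 8609.6, "Rolling": 9628.2}

ratios = {terrain: observed[terrain] / fitted[terrain] for terrain in observed}
overall = sum(observed.values()) / sum(fitted.values())

for terrain, ratio in ratios.items():
    print(f"{terrain:12s} {ratio:.2f}")
print(f"{'Grand Total':12s} {overall:.2f}")
```

A model without ‘Terrain’ overpredicts on flat segments and underpredicts on mountainous ones, even though the totals agree.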
Option 1. Add terrain by two multiplier parameters
Open #16. NB fit with terrain multipliers
Terrain added to data
Added column for terrain multiplier
Add two parameters
Click ‘SOLVER’
Note: $L + \beta_1 L^2$
The evolution of the β1 of L
Initially: E{μ} ∝ L
When only L in model:
Objective function        Power of L
Weighted LS               0.87
Poisson Likelihood        0.86
NB Likelihood             0.87
Absolute differences      0.91
Chi Squared               0.74
Total Absolute Bias       0.74
After AADT added: β1 = 1.08 in $L^{\beta_1}$ and 0.076 in $L + \beta_1 L^2$
After ‘Terrain’ added: 0.005 in $L + \beta_1 L^2$
Back to E{μ} ∝ L
Lessons:
1. Modeling is a search involving trial and error
2. It is not clear when the search should end
3. All conclusions are provisional
Option 2. Fit separate models using data by terrain
Terrain       Intercept   β0      Slope     β1       β2      b       Log-lik.
Flat          0.006       0.073   -0.0005   0.122    0.941   2.333    -4755
Rolling       0.007       0.186   -0.016    0.0007   0.850   4.983   -13393
Mountainous   0.029       0.289   -0.020    0.0004   0.860   1.846    -7635
Sum                                                                  -25741
Which option to choose, multipliers or separate models?
Log-likelihood increased by 134 (from -25875 to -25741)
but parameters increased by 12.
Should we test the hypothesis of no difference?
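One way to frame such a test is sketched below. It assumes that the multiplier model is a restricted version of the separate models, so that twice the log-likelihood gain can be compared with a chi-square distribution with 12 degrees of freedom. That nesting is an assumption. The critical value is the standard table entry for the upper 5% point.

```python
loglik_multipliers = -25875.0   # Option 1, from the slide
loglik_separate = -25741.0      # Option 2, from the slide
extra_params = 12               # from the slide

gain = loglik_separate - loglik_multipliers
lr_statistic = 2 * gain

# Upper 5% point of chi-square with 12 degrees of freedom (standard table value)
CHI2_CRIT_12_DF_5PCT = 21.03

aic_prefers_separate = gain > extra_params  # AIC: gain must exceed 1 per parameter
lr_rejects_no_difference = lr_statistic > CHI2_CRIT_12_DF_5PCT
print(gain, lr_statistic, aic_prefers_separate, lr_rejects_no_difference)
```

Both checks favour the separate models by a wide margin. Whether the difference matters in practice is a separate question.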
Is there a practically
significant difference?
Yes!
An unexpected turn
Adding terrain
did not cure
the CURE
But...
The modeling process
Summary for section 10. (Choosing a model equation)
1. Finding the function behind the data is important; don’t
just assume it. Parameter estimation is secondary.
2. Considerations: EDA, Simplicity and fit, not theory.
3. Overfitting and the AIC-BIC crutch.
4. We tried a few (Power, Polynomial, Hoerl).
5. The ‘Solver’ did not choke.
6. The choice of function was seen to matter.
7. Uncertainties are not mainly statistical; model
equations are not laws of nature.
8. What do various equations look like?
9. How to adapt the C-F spreadsheet to ‘basic’ &
‘multiplier’.
10. Two ways of adding ‘Terrain’.
11. Depicting the SPF modeling process.
In closing
Objectives:
1. How to develop SPFs using a spreadsheet
2. To promote understanding
What are SPFs for? This determines the direction.
Can they deliver SPFs? I doubt it.
Building blocks (How to do EDA, how to fit
function,...) and Tools (Pivot Table, Solver, CURE,...)
Snakes and Ladders modeling
We discussed elements of modeling with road safety data
Good modeling requires a thoughtful modeller
You are it