6. which fit is fitter.pptx

6. Which fit is fitter
CH1. What is what
CH2. A simple SPF
CH3. EDA
CH4. Curve fitting
CH5. A first SPF
CH6. Which fit is fitter
CH7. Choosing the objective function
CH8. Theoretical stuff
CH9. Adding variables
CH11. Choosing a model equation
In this session:
1. What makes for a good fit
2. Introducing the CURE plot
3. Eliminating ‘overall bias’
4. The bias of a fit
5. Using the CURE plot
What makes for a good fit?
Common ‘goodness-of-fit’ measures: R², χ², AIC,...
These are ‘overall’ (single-number) measures.
For SPFs intended for applications they are insufficient. Recall…
Two perspectives on SPF
E{m} and s{m} = f(Traits, parameters)
• Cause-and-effect-centered perspective
• Applications-centered perspective
• One judges the fit of a model by its residuals.
• In SPFs for applications a fit is thought good only if the
residuals are closely packed around 0 everywhere.
Perhaps acceptable
SPF Workshop February 2014, UBCO
But this one is not!
[Figure: Residual (Observed - Fitted), from -40 to 40, plotted against Variable value, from 0 to 100; regions annotated ‘Fitted is too small’ and ‘Fitted is too large’.]
The main figure of merit for SPFs: Unbiased Everywhere
The usual residual plot
Informative?
But, when the same residuals are cumulated
From spreadsheet
Compute Residual → Cumulate → Plot
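The three spreadsheet steps can also be sketched in a few lines of Python. This is a minimal illustration, not the workshop spreadsheet: the function name `cure_points` and the data are made up for the example, and the rows are sorted by the variable before cumulating.

```python
from itertools import accumulate

def cure_points(variable, observed, fitted):
    """Return (variable value, cumulated residual) pairs, sorted by the variable."""
    # Sort rows in ascending order of the variable (e.g. segment length).
    rows = sorted(zip(variable, observed, fitted), key=lambda row: row[0])
    # Compute Residual: Observed - Fitted.
    residuals = [obs - fit for _, obs, fit in rows]
    # Cumulate: running sum of the residuals.
    cumulated = list(accumulate(residuals))
    # Plot: these pairs are the x and y values of the CURE plot.
    return [(row[0], c) for row, c in zip(rows, cumulated)]

# Illustrative (made-up) data, not from the workshop data set.
length = [0.5, 0.1, 0.3, 0.2]
observed = [3, 0, 2, 1]
fitted = [2.5, 0.5, 1.5, 1.0]
points = cure_points(length, observed, fitted)
```

Plotting the second element of each pair against the first gives the CURE plot.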
The CURE Plot
Now one can see!
0-A, B-C, E-F: Observed > Fitted (the curve rises), not good;
A-B, D-E: Fitted > Observed (the curve falls), also bad.
Where the drop is precipitous there may be outliers.
Benefits:
1. Chaos is replaced by clarity.
2. We can recognize a good model.
3. The cost of parameterization is clear.
(2) What should a good CURE plot look like?
• Should not have long up or down runs
• Should not have vertical drops
• Should meander around the horizontal axis
(3) The cost of parametric curve fitting is now manifest
Imposing the function 1.675 × (Segment Length)^0.866 on the
data causes bias almost everywhere!
Biased estimates → Bad decisions → Real costs
[Figure annotation: No bias]
How much bias is there?

Segment       Accumulated Accidents   Fitted Accidents   Bias   Bias/Fitted Accidents
Origin to A   1899                    1596               303    0.19
A to B        854                     1532               -688   -0.44
B to C        ...                     ...                ...    ...
...

TAB = Total Accumulated Bias = 303 + |-688| + ...
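The arithmetic behind the table can be sketched as follows. The function names are hypothetical. The example uses the ‘Origin to A’ row to show how bias and its ratio to fitted accidents are computed, and the two bias values quoted in the TAB formula to show the sum of absolute values.

```python
def segment_bias(accumulated, fitted):
    """Bias over a run of the CURE plot, and its ratio to the fitted accidents."""
    bias = accumulated - fitted
    return bias, bias / fitted

def total_accumulated_bias(biases):
    """TAB: the sum of the absolute biases of all runs."""
    return sum(abs(b) for b in biases)

# 'Origin to A' row of the table.
bias, ratio = segment_bias(1899, 1596)

# First two terms of the TAB sum, using the bias values as given on the slide.
tab_first_two = total_accumulated_bias([303, -688])
```

Absolute values are used so that positive and negative runs do not cancel each other.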
Levelling the playing field
Open spreadsheet #7. OLS with constraint
When the scale parameter is determined by ‘Solver’
the sum of fitted values is usually not the same as the
sum of crash counts. This is a blemish.
To remove this blemish, add the constraint that the sum of
fitted values equals the sum of crash counts.
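Outside the spreadsheet, the same constraint can be imposed directly. The sketch below is not the Solver procedure: it assumes the model form b0 × (Segment Length)^b1, derives b0 from the constraint, and uses a hand-written golden-section search over b1, which presumes the sum of squared differences has a single minimum in the search interval. All names, the search interval and the data are illustrative.

```python
def scale_for_constraint(lengths, counts, b1):
    """b0 that makes the sum of fitted values equal the sum of crash counts."""
    return sum(counts) / sum(x ** b1 for x in lengths)

def ssd(lengths, counts, b0, b1):
    """Sum of squared differences between observed and fitted values."""
    return sum((y - b0 * x ** b1) ** 2 for x, y in zip(lengths, counts))

def fit_constrained(lengths, counts, lo=0.0, hi=3.0, tol=1e-8):
    """Golden-section search over b1; b0 always follows from the constraint."""
    inv_phi = (5 ** 0.5 - 1) / 2

    def objective(b1):
        b0 = scale_for_constraint(lengths, counts, b1)
        return ssd(lengths, counts, b0, b1)

    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if objective(c) < objective(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    b1 = (a + b) / 2
    return scale_for_constraint(lengths, counts, b1), b1

# Illustrative (made-up) data: segment lengths in miles and crash counts.
lengths = [0.1, 0.2, 0.5, 1.0, 2.0]
counts = [0, 1, 1, 2, 3]
b0, b1 = fit_constrained(lengths, counts)
fitted_total = sum(b0 * x ** b1 for x in lengths)
```

Because b0 is computed from the constraint for every trial b1, the fitted values sum to the crash counts whatever b1 the search settles on, and the residuals therefore sum to zero.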
How to add constraints
Now click ‘Solve’ to get:
[Screenshot: Solver result, labelled ‘With constraint’]
When is a CURE plot good enough?
Open (again): #7 OLS with constraint
After SOLVER with constraint was used you should now see:
Open: #8 CURE computations
Copy values in columns A, B, D and E into CURE spreadsheet
Important step: on the ‘DATA’ tab choose ‘Sort’ and sort in ascending order by ‘miles’.
Now add columns E, F, and G.
Column E holds the residual (E4 = C4-D4) and column F the cumulated residual (F4 = F3+E4).
Note that for the last row (n=5323) the Cumulated
Residuals = 0. Why?
Below is a plot of cumulative residuals (column F)
against segment length (column B).
Upward drift means that in this range ‘observed’
tends to be consistently larger than ‘fitted’.
[Figure: Cumulative residuals (about -500 to 400) plotted against Segment Length (0 to 3); truncated at 3 miles.]
A vertical gap indicates a possible ‘outlier’.
The question was when a CURE plot is good enough.
Computing the limits which a random walk
should seldom exceed. Details in text.
The last ‘cumulated squared residual’
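The slide defers the details to the text, so the following is only a sketch under a stated assumption: that the limit at point i is s’(i) = sqrt( s²(i) × (1 - s²(i)/s²(n)) ), where s²(i) is the cumulated squared residual up to point i and s²(n) is the last cumulated squared residual. This is a commonly used formulation for CURE plot limits; the function name `cure_limits` is hypothetical.

```python
from itertools import accumulate
from math import sqrt

def cure_limits(residuals):
    """Return s' at each point, for residuals already sorted by the variable.

    Assumed formula: s'(i) = sqrt(s2(i) * (1 - s2(i) / s2(n))),
    with s2(i) the cumulated squared residual up to point i.
    """
    cum_sq = list(accumulate(r * r for r in residuals))
    last = cum_sq[-1]  # the last 'cumulated squared residual'
    if last == 0:
        # All residuals are zero: the limits collapse to zero everywhere.
        return [0.0 for _ in cum_sq]
    return [sqrt(s * (1 - s / last)) for s in cum_sq]

# Illustrative (made-up) residuals.
s_prime = cure_limits([1, -2, 2, -1])
upper = [2 * s for s in s_prime]   # +2s'
lower = [-2 * s for s in s_prime]  # -2s'
```

Under this formula the limits are zero at the last point, which matches a CURE plot that returns to zero there.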
[Figure: CURE plot with the +2s’ and -2s’ limits.]
Guidance:
Rule of thumb: 95% of the CURE plot within ±2s’.
This fit does not pass muster.
40% within ±0.5s’: stop, you are in danger of overfitting.
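To apply the guidance, one can count what share of the CURE plot lies inside a given band. The helper below is a hypothetical sketch. It takes the cumulated residuals and the s’ values as already computed, and how the resulting shares are compared with the 95% and 40% figures is left to the reader's reading of the rule of thumb.

```python
def share_within(cumulated, s_prime, k):
    """Fraction of points whose cumulated residual lies within ±k·s'."""
    inside = sum(1 for c, s in zip(cumulated, s_prime) if abs(c) <= k * s)
    return inside / len(cumulated)

# Illustrative (made-up) values, not from the workshop data set.
cumulated = [0.5, -1.0, 2.5, 0.0]
s_prime = [1.0, 1.0, 1.0, 0.0]

share_2s = share_within(cumulated, s_prime, 2.0)    # share within ±2s'
share_half = share_within(cumulated, s_prime, 0.5)  # share within ±0.5s'
```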
Which fit is better?
Objective Function       b0      b1
∑ squared differences    1.656   0.870
∑ absolute differences   1.618   0.911
The steeper the run, the larger the bias.
Red increased the A to B bias.
Black is better.
Summary for section 6. (Which fit is fitter?)
1. For SPFs the main figure of merit is whether the fit is
unbiased everywhere;
2. For applications, ‘overall’ measures (R², χ², AIC,...) are of
limited use;
3. The usual plot of residuals is not informative; the
CURE plot opens one’s eyes;
4. We showed how to compute bias and Total
Accumulated Bias. The cost of parametric curve fitting was
manifest;
5. It is clear what a good CURE plot should look like;
6. By adding a constraint we eliminated overall bias;
7. We computed ±2s’ limits and provided guidance on
when a CURE plot is acceptable and when overfitting is
a danger;
8. We showed how to decide which of two CURE plots is
better.
9. All fits were bad. Perhaps, partly, because minimizing
SSD is not good since crash count distributions are not
symmetrical. What should be optimized? Next.