6. Which fit is fitter

CH1. What is what
CH2. A simple SPF
CH3. EDA
CH4. Curve fitting
CH5. A first SPF
CH6. Which fit is fitter
CH7. Choosing the objective function
CH8. Theoretical stuff
CH9. Adding variables
CH11. Choosing a model equation

In this session:
1. What makes for a good fit
2. Introducing the CURE plot
3. Eliminating ‘overall bias’
4. The bias of a fit
5. Using the CURE plot

What makes for a good fit?
Common ‘goodness-of-fit’ measures: R², χ², AIC, ...
These are ‘overall’ (single-number) measures. For SPFs used in applications they are insufficient.

Recall: two perspectives on SPFs
E{μ} and σ{μ} = f(Traits, parameters)
Cause and effect centered perspective
Applications centered perspective

• One judges the fit of a model by its residuals.
• In SPFs for applications a fit is thought good only if the residuals are closely packed around 0 everywhere.

Perhaps acceptable

SPF Workshop February 2014, UBCO

But this one is not!
[Figure: Residual (Observed - Fitted) against variable value, with regions marked ‘Fitted is too small’ and ‘Fitted is too large’.]

The main figure of merit for SPFs: Unbiased Everywhere

The usual residual plot
Informative?

But, when the same residuals are cumulated
From spreadsheet: Compute Residual → Cumulate → Plot

The CURE Plot
[Figure: cumulated residuals (Observed - Fitted), with points marked 0, A, B, C, D, E, F.]
Now one can see!
• 0-A, B-C, E-F: Observed > Fitted, not good;
• A-B, D-E: Fitted > Observed, bad;
• Where the drop is precipitous there may be outliers.

Benefits:
1. Chaos is replaced by clarity.
2. We can recognize a good model.
3. The cost of parameterization is clear.

(2) What should a good CURE plot look like?
• Should not have long up or down runs
• Should not have vertical drops
• Should meander around the horizontal axis

(3) The cost of parametric curve fitting is now manifest
Imposing the function 1.675 × (Segment Length)^0.866 on the data causes bias almost everywhere!
Biased estimates → Bad decisions → Real costs
[Figure: CURE plot for this function, with a ‘No bias’ label.]

How much bias is there?

Segment        Accumulated accidents   Fitted accidents   Bias   Bias/Fitted
Origin to A    1899                    1596               303    0.19
A to B         854                     1532               -688   -0.44
B to C         ...                     ...                ...    ...
...

TAB = Total Accumulated Bias = 303 + |-688| + ...

Levelling the playing field
Open spreadsheet #7. OLS with constraint
When the scale parameter is determined by ‘Solver’, the sum of fitted values is usually not the same as the sum of crash counts. This is a blemish. To remove this blemish, add a constraint.

How to add constraints
[Figure: Solver dialog showing where to click to add a constraint.]

Now click ‘Solve’ to get
[Figure: Solver result with constraint.]

When is a CURE plot good enough?
Open (again): #7 OLS with constraint
After SOLVER with constraint was used you should now see:
[Figure: spreadsheet after the constrained solve.]
Open: #8 CURE computations
Copy values in columns A, B, D and E into the CURE spreadsheet.

[Figure: CURE spreadsheet with the copied columns.]
Important step: on the ‘DATA’ tab choose ‘Sort’ and sort in ascending order by ‘miles’.

Now add columns E, F, and G.
Cell formulas shown: E4 = C4-D4 (residual); F4 = F3+E4 (cumulated residual).
Note that for the last row (n=5323) the Cumulated Residuals = 0. Why?

Below is a plot of segment length (column B) against cumulative residuals (column F).
[Figure: cumulative residuals against segment length, truncated at 3 miles.]
• Upward drift means that in this range ‘observed’ tends to be consistently larger than ‘fitted’.
• A vertical gap is a possible ‘outlier’.
The question was when a CURE plot is good enough.

Computing the limits which a random walk should seldom exceed. Details in text.
[Figure: CURE plot with +2σ′ and -2σ′ limits; a spreadsheet annotation points to the last ‘cumulated squared residual’.]

Guidance:
• Rule of thumb: 95% of the CURE plot within ±2σ′. This fit does not pass muster.
• 40% within ±0.5σ′: stop, you are in danger of overfitting.

Which fit is better?

Objective function         b0      b1
∑ squared differences      1.656   0.870
∑ absolute differences     1.618   0.911

[Figure: two CURE plots, one black and one red.]
The steeper the run, the larger the bias. Red increased the A to B bias. Black is better.

Summary for section 6. (Which fit is fitter?)
1. For SPFs the main figure of merit is a fit that is unbiased everywhere;
2. For applications, R², χ², AIC, ... and other ‘overall’ measures are of limited use;
3. The usual plot of residuals is not informative; the CURE plot opens one’s eyes;
4. We showed how to compute bias and Total Accumulated Bias. The cost of parametric curve fitting was manifest;
5. It is clear what a good CURE plot should look like;
6. By adding a constraint we eliminated overall bias;
7. We computed ±2σ′ limits and provided guidance on when a CURE plot is acceptable and when overfitting is a danger;
8. We showed how to decide which of two CURE plots is better;
9. All fits were bad. Perhaps this is partly because minimizing the sum of squared differences (SSD) is not appropriate, since crash count distributions are not symmetrical. What should be optimized? Next.
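
The spreadsheet steps of this section (sort by the variable, compute residuals, cumulate them, and bound the resulting walk) can be sketched in Python with NumPy. This is only a sketch under stated assumptions: the function names `cure_table` and `share_within` are hypothetical, and the formula for the limits is an assumption, since the slides defer the details to the text. The form assumed here takes the variance of the walk at row n to be the cumulated squared residual at n, multiplied by one minus its ratio to the last cumulated squared residual.

```python
import numpy as np


def cure_table(x, observed, fitted):
    """Return sorted x, cumulated residuals, and the walk's standard deviation.

    Assumption (not spelled out on the slides): the variance at row n is
    cum_sq[n] * (1 - cum_sq[n] / cum_sq[-1]), where cum_sq is the running
    sum of squared residuals.
    """
    x = np.asarray(x, dtype=float)
    observed = np.asarray(observed, dtype=float)
    fitted = np.asarray(fitted, dtype=float)

    # Sort in ascending order of the variable (the 'Sort by miles' step).
    order = np.argsort(x, kind="stable")
    x_sorted = x[order]

    # Residual = Observed - Fitted, then cumulate.
    residual = observed[order] - fitted[order]
    cum_res = np.cumsum(residual)

    # Running sum of squared residuals; its last value scales the limits.
    cum_sq = np.cumsum(residual ** 2)
    total_sq = cum_sq[-1]
    if total_sq > 0:
        variance = cum_sq * (1.0 - cum_sq / total_sq)
    else:
        variance = np.zeros_like(cum_sq)
    sigma_star = np.sqrt(np.clip(variance, 0.0, None))
    return x_sorted, cum_res, sigma_star


def share_within(cum_res, sigma_star, k):
    """Share of CURE points with |cumulated residual| <= k * sigma_star."""
    return float(np.mean(np.abs(cum_res) <= k * sigma_star))


if __name__ == "__main__":
    # Made-up illustration data; fitted values sum to the observed total.
    lengths = [3.0, 1.0, 2.0, 4.0]
    crashes = [2, 0, 3, 1]
    fitted = [1.5, 1.0, 1.5, 2.0]
    xs, cum, sig = cure_table(lengths, crashes, fitted)
    print(xs, cum, sig)
    print(share_within(cum, sig, 2.0), share_within(cum, sig, 0.5))
```

When the fitted values come from a fit constrained so that their sum equals the sum of crash counts, the last cumulated residual is 0. The two shares returned by `share_within` with `k=2.0` and `k=0.5` are the quantities to compare against the 95% and 40% rules of thumb.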