10. Choosing the f()
$E\{\mu\} = f(X_1, \ldots, X_n, \beta_1, \ldots, \beta_m)$

CH1. What is what
CH2. A simple SPF
CH3. EDA
CH4. Curve fitting
CH5. A first SPF
CH6. Which fit is fitter
CH7. Choosing the objective function
CH8. Theoretical stuff (skip)
CH9. Adding variables
CH10. Estimate accuracy (missing)

CH10. Choosing a model equation

Now that we have discussed the choice of objective function and the addition of variables, we focus on the function in which the variables combine.

In this session (mainly):
1. What functions to try?
2. Going for a better fit.
3. What functions look like.
4. Adding ‘Terrain’ as a variable.
5. The modeling process.

$E\{\mu\} = f(X_1, \ldots, X_n, \beta_1, \ldots, \beta_m)$, where the $X$’s are variables and the $\beta$’s are parameters.

Determining what function hides behind the noisy data is key to getting good estimates of $E\{\mu\}$ and $\sigma\{\mu\}$.

SPF workshop February 2014

Why “key to...”?

Aristotle (384-322 B.C.): “Acceleration of a free-falling body ∝ its weight.”
One could use data to estimate the proportionality constant. This f() would give poor predictions and be of no practical use.

Galileo (1564-1642)
With Aristotle’s f(), space travel would not work.

“Aristotle maintained that women have fewer teeth than men; although he was twice married, it never occurred to him to verify this statement by examining his wives' mouths.”
Bertrand Russell

Functions are primary, parameters secondary

So far we used $L^{\beta_1}$ to represent the contribution of L.
In the future we will try $L + \beta_1 L^2$ and other functions.
The two $\beta_1$’s make different contributions to $E\{\mu\}$.

Moral: Parameters get their meaning from the function in which they feature.
If so, why so little attention to getting the function right?

Why so little attention to f()?

Generic: $E\{\mu\} = f(X_1, \ldots, X_n, \beta_1, \ldots, \beta_m)$

Many modellers state, without giving reasons, that $e^{\sum_{\forall i} \beta_i X_i}$ is the ‘f’, and proceed to parameter estimation. Why?
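The moral that parameters get their meaning from the function can be made concrete with a minimal sketch. It assumes the two forms for the contribution of L mentioned on the slides, a power form and a polynomial form, and borrows the values quoted later in this session for the case with AADT in the model (read here as 1.08 for the power form and 0.076 for the polynomial form) purely as illustration. The function names and the segment lengths are hypothetical.

```python
def power_factor(length, beta1):
    """Contribution of segment length L in the power form L**beta1."""
    return length ** beta1


def polynomial_factor(length, beta1):
    """Contribution of segment length L in the polynomial form L + beta1 * L**2."""
    return length + beta1 * length ** 2


# Illustrative values only: both are called "beta1",
# but each one belongs to its own function.
BETA1_POWER = 1.08
BETA1_POLYNOMIAL = 0.076

for length in (0.5, 1.0, 2.0):  # hypothetical segment lengths
    own_power = power_factor(length, BETA1_POWER)
    own_poly = polynomial_factor(length, BETA1_POLYNOMIAL)
    # Same number, wrong function: the power-form beta1 used in the polynomial.
    swapped = polynomial_factor(length, BETA1_POWER)
    print(f"L={length}: power={own_power:.3f}  polynomial={own_poly:.3f}  swapped={swapped:.3f}")
```

With each parameter in its own function the two contributions are of similar size. Moving the power-form value into the polynomial form (the last column) gives a very different contribution, so the number alone says nothing without its function.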
The path:

From $E\{\mu\} = f(X_1, \ldots, X_n, \beta_1, \ldots, \beta_m)$
To $E\{\mu\} = e^{\sum_{\forall i} \beta_i X_i}$
In 4 questionable steps, as extracted from the Handbook of Econometrics.

From $E\{\mu\} = f(X_1, \ldots, X_n, \beta_1, \ldots, \beta_m)$
1. For historical reasons and convenience in estimation (only linear in parameters): $\sum_i f_i(X_1, X_2, \ldots)\beta_i$
2. It is desirable to identify effects separately: $\sum_i f_i(X_i)\beta_i$
3. Same f “for ease of computation and interpretation and for aesthetic reasons”: $\sum_i f(X_i)\beta_i$
4. Assume a specific ‘f’: $f(X_i) = X_i$
To $E\{\mu\} = e^{\sum_{\forall i} \beta_i X_i}$

“The computer models of economists have to use equations that represent human behaviour; by common consent, they do it amazingly badly”
Jon Turney, A model world, December 16, 2013

In SPFs

From $E\{\mu\} = f(X_1, \ldots, X_n, \beta_1, \ldots, \beta_m)$
To $E\{\mu\} = \beta_0 \times f_1(X_1, \boldsymbol{\beta}_1) \times f_2(X_2, \boldsymbol{\beta}_2) \times \ldots$

Multiplicative and (usually) single-variable factors.
But:
Not all functions are the same.
Not linear in parameters.
Functions are not preselected (not $E\{\mu\} = e^{\sum_{\forall i} \beta_i X_i}$).

Finding the right f() is difficult

I will give you data about Y, X1, and X2. You find f in Y = f(X1, X2).

Measurements
Object      1       2      …     99     100
X1 [m]     2.06    7.64    …    4.37   10.03
X2 [m]     7.91    5.51    …    3.60    4.08
Y [m]      8.18   10.10    …    5.66   10.77

EDA

It looks like $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2$ should do.
The parameter estimates were 0.35, 0.73 and 0.66.
Good correspondence!

The elusive f()

[Diagram: Reality; data on Y, X1, X2]

The researcher chose $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2$
but should have chosen $Y = (X_1^{\beta_1} + X_2^{\beta_2})^{\beta_3}$.

Moral 1: The $\beta_1$ of the linear model (0.73) has nothing to do with the $\beta_1$ of the correct model (1.98).

Moral 2: f() is nearly unfathomable. Few would have chosen the Pythagorean model equation if the theory were not known. My students didn’t.

Moral 3: Is Occam’s razor good advice?

Moral 4: However reasonable the choice of $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2$, it is wrong to say that when X1 is increased by 1 m, Y will increase by $\beta_1$ = 0.73 m.
If X1 << X2 then the increase in Y is close to 0; if X2 << X1 then the increase in Y is close to 1.
The regression parameter can tell the result of a manipulation only when f() is right.

Implication for SPFs & CMFs
If we do not know what the true f() is, regressions may predict well what $E\{\mu\}$ is, but cannot be trusted to predict how $E\{\mu\}$ will change if a variable is changed.

Moral 5: The models we use are either additive or multiplicative and made up of single-variable building blocks. But $Y = (X_1^{\beta_1} + X_2^{\beta_2})^{\beta_3}$ is neither.
So, even simple phenomena may not be represented by commonly used model forms.

In sum, f() is elusive.
If the aim is to get the CMF, f() must be right.
If the aim is to get good estimates of $E\{\mu\}$, f() does not have to be right.

Searching for suitable f()’s

So far we used
$E\{\mu\} = \beta_0 [1 - \beta_{slope}(Year - 1986)] L^{\beta_1} AADT^{\beta_2}$
The $\beta$’s were estimated as if the function were the right one. Is it? Very, very unlikely.
There is no theory behind this f(). The only guides are:
a) Parsimony of parameters
b) Quality of fit
c) EDA
If f() is not right, how may the parameters be used?

Searching for suitable f()’s

The tools for finding the right f() are not well developed. As was shown, without a theory even a simple f() is difficult to find.

The Modellers
Tantalus's punishment was ‘temptation without satisfaction’.

Other functions to be tried, e.g.:
Power        $X^{\beta}$
Polynomial   $X + \beta X^2$
Hoerl        $X^{\beta_1} e^{\beta_2 X}$
...
Mixtures

Which will fit better? To choose well one has to know what functions look like.

What do functions (equations) look like?

A visualization tool.
Open #14: ‘Visualise functions.xlsx’ on ‘The Tool’ workpage.
Three panels: Basic, Modifier, and Composite.
[Screenshot callouts: the ordinates; the parameters; the argument (abscissa)]

This is what the ‘basic’ functions look like

[Figure: three panels, 1. Power, 2. Polynomial, 3. Logistic, each plotted against X from 0 to 2]

Can you make the ‘power’ function bend down?
Can you give the ‘polynomial’ a maximum?
Lower the ‘logistic’ at x=0.5 while keeping its value at x=1.

This is what the ‘modifier’ functions look like

The ‘Composite’ functions

Hoerl = Power × Exponential
Logistic × Linear

What composite functions look like

Power & Exponential (Hoerl)
[Figure: 1. Power × a. Exponential = Hoerl, each plotted against X from 0 to 2, ordinate from 0.0 to 1.2]

Can you make Hoerl lose its peak?

Comparing fits: a hitch

One can usually improve the fit by using functions with more parameters. In the limit ....
Same number of parameters: fits can be compared.
Not the same number of parameters: how then to compare fits?

The danger of ‘overfitting’. What to do?

1. SSD must be larger than the sum of fitted values.
2. By AIC (Akaike Information Criterion): add a parameter if it increases ln(maximized likelihood) by more than 1, that is, if it increases the maximized likelihood by a factor of more than e ≈ 2.7.
3. By BIC (Bayesian Information Criterion): add a parameter if it increases ln(maximized likelihood) by more than ln(number of data points)/2.

Trying for a better fit

We used
$E\{\mu\} = \beta_0 [1 - \beta_{slope}(Year - 1986)] L^{\beta_1} AADT^{\beta_2}$
Would it be better if $L^{\beta_1}$ was replaced by $L + \beta_1 L^2$, or if $AADT^{\beta_2} e^{\beta_3 AADT}$ was used instead of $AADT^{\beta_2}$?

Open #15: ‘Base for fit improvements’ on ‘Power’ workpage.
This is where we left off.

Now, still on #15, go to the ‘Polynomial’ workpage.
Replace the power term for L by (B8+$CB$2*B8^2) and copy down. That’s all.
Now use ‘SOLVER’.

Model equation    Log-likelihood
Power                -26329.0
Polynomial           -26311.5
Increase                 17.5

Increase in likelihood with no addition of parameters is good.
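The AIC and BIC rules for adding parameters can be checked against this table with a short sketch. It assumes the standard definitions AIC = 2k - 2 ln(L) and BIC = k ln(n) - 2 ln(L), under which one extra parameter must raise ln(L) by more than 1 for AIC (a likelihood ratio of e ≈ 2.7) and by more than ln(n)/2 for BIC. The function names are hypothetical, and so is the number of data points, which is not given on the slides.

```python
import math


def aic_supports(delta_loglik, added_params):
    """True if AIC = 2k - 2 ln(L) decreases: ln(L) rose by more than 1 per added parameter."""
    return delta_loglik > added_params


def bic_supports(delta_loglik, added_params, n_points):
    """True if BIC = k ln(n) - 2 ln(L) decreases: ln(L) rose by more than ln(n)/2 per added parameter."""
    return delta_loglik > added_params * math.log(n_points) / 2.0


LOGLIK_POWER = -26329.0       # from the table
LOGLIK_POLYNOMIAL = -26311.5  # from the table
delta = LOGLIK_POLYNOMIAL - LOGLIK_POWER

N_POINTS = 5000  # hypothetical: the number of data points is not stated

# Power -> Polynomial: no parameters added, so any increase passes both rules.
print(delta, aic_supports(delta, 0), bic_supports(delta, 0, N_POINTS))

# What-if: the same increase bought with one extra parameter.
print(aic_supports(delta, 1), bic_supports(delta, 1, N_POINTS))
```

Both rules only compare log-likelihoods; they say nothing about whether the CURE plot improves or whether the change is practically important.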
Caution: Polynomials are risky.
Even though the likelihood increased by a factor of $e^{17.5}$, the CURE plot did not change much.

In praise of the ‘Solver’

Changing the functional form was straightforward.
We replaced $L^{\beta_1}$ by $L + \beta_1 L^2$.
Note: the new form mixes addition and multiplication, but ‘Solver’ did not choke!
Modellers tend to use only linearized expressions in which addition and multiplication do not mix. Why?

Recall: Residual ≡ Observed - Fitted
From the origin to A the fitted value is too small; the only way to increase it is to allow a positive intercept.

Add intercept

Still on #15, go to the ‘Add intercept’ workpage.
Increase in log-likelihood = ?
Justified by AIC, BIC?
Practically important?

We are still “Trying for a better fit”

For AADT I replaced ‘Power’ by the sigmoids ‘Logistic’, ‘Weibull’ and ‘Hoerl’.
Similar log-likelihoods but differing predictions when AADT > 10,000.

Footloose!

Which of the many alternative functions to choose?
(The uncertainty due to this source is never considered.)

Reporting
$E\{\mu\} = 0.007 + 0.173\,[1 - 0.020\,(Year - 1986)]\,(L + 0.066 L^2)\left(\frac{AADT}{1000}\right)^{0.939}$
$V\{\mu\} = \frac{(E\{\mu\})^2}{2.126\,(\text{Segment Length})}$

Moral: Choice of function matters.
The estimate of $E\{\mu\}$ may depend strongly on what function the modeller chooses.
Therefore, uncertainty of prediction is not only (mainly?) a matter of statistical inaccuracies.

CURE plots still bad. What to do?

Adding the ‘Terrain’ variable

For each segment we have data about ‘Terrain’ (F, R or M).
Terrain is a proxy for ‘grade’, ‘curvature’ etc., for which we do not have data.
Is ‘Terrain’ safety-relevant?

VIEDA (Pivot)
Terrain        Observed Accidents    Fitted Values    Observed/Fitted
Flat                  1882               3480.2             0.54
Mountainous          11273               8609.6             1.31
Rolling               8563               9628.2             0.89
Grand Total          21718              21718.0             1.00

Illustrates bias-in-use if ‘Terrain’ is not in the model.

Option 1. Add terrain by two multiplier parameters

Open #16: NB fit with terrain multipliers.
Terrain added to data.
Added column for terrain multiplier.

Add two parameters.
Click ‘SOLVER’.
Note $L + \beta_1 L^2$.

The evolution of the function of L

Initially: $E\{\mu\} \propto L$

When only L in model:
Objective function       Power of L
Weighted LS                 0.87
Poisson Likelihood          0.86
NB Likelihood               0.87
Absolute differences        0.91
Chi Squared                 0.74
Total Absolute Bias         0.74

After AADT added: in $L^{\beta_1}$, $\beta_1$ = 1.08, and in $L + \beta_1 L^2$ it is 0.076.
After ‘Terrain’ added: 0.005 in $L + \beta_1 L^2$.
Back to $E\{\mu\} \propto L$.

Lessons:
1. Modeling is a search involving trial and error.
2. It is not clear when the search should end.
3. All conclusions are provisional.

Option 2. Fit separate models using data by terrain

Terrain        intercept    $\beta_0$    $\beta_{slope}$    $\beta_1$    $\beta_2$      b      Log-Lik.
Flat             0.006       0.073        -0.0005           0.122        0.941       2.333     -4755
Rolling          0.007       0.186        -0.016            0.0007       0.850       4.983    -13393
Mountainous      0.029       0.289        -0.020            0.0004       0.860       1.846     -7635
Sum                                                                                           -25741

Which option to choose, multipliers or separate models?
The log-likelihood increased by 134 (from -25875 to -25741) but the number of parameters increased by 12.
Should we test the hypothesis of no difference?

Is there a practically significant difference? Yes!

An unexpected turn

Adding terrain did not cure the CURE.
But...

The modeling process

Summary for section 10 (Choosing a model equation)

1. Finding the function behind the data is important; don’t just assume it. Parameter estimation is secondary.
2. Considerations: EDA, simplicity and fit, not theory.
3. Overfitting and the AIC-BIC crutch.
4. We tried a few (Power, Polynomial, Hoerl).
5. The ‘Solver’ did not choke.
6. The choice of function was seen to matter.
7. Uncertainties are not mainly statistical; model equations are not laws of nature.
8. What do various equations look like?
9. How to adapt the C-F spreadsheet to ‘basic’ & ‘multiplier’.
10. Two ways of adding ‘Terrain’.
11. Depicting the SPF modeling process.

In closing

Objectives:
1. How to develop SPFs using a spreadsheet
2. To promote understanding

What are SPFs for? This determines the direction.
Can they deliver SPFs? I doubt it.
Building blocks (how to do EDA, how to fit a function, ...) and tools (Pivot Table, Solver, CURE, ...)
Snakes and Ladders modeling

We discussed elements of modeling with road safety data.
Good modeling requires a thoughtful modeller.
You are it.
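The ‘elusive f()’ example from this session can be replayed in a few lines. The following is a minimal sketch, not the data behind the slides. It assumes 100 objects with X1 and X2 drawn uniformly between 1 m and 11 m, Y equal to the Pythagorean value plus normally distributed measurement noise with a standard deviation of 0.3 m, and an ordinary least-squares fit of the linear form. The ranges, the noise level, the seed, and the helper names are all hypothetical.

```python
import math
import random


def fit_linear(x1, x2, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 via the normal equations."""
    rows = [(1.0, a, b) for a, b in zip(x1, x2)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting on the augmented matrix.
    aug = [xtx[i] + [xty[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda k: abs(aug[k][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(3):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][3] / aug[i][i] for i in range(3)]


def true_slope_x1(x1, x2):
    """dY/dX1 when the true relation is Y = sqrt(X1**2 + X2**2)."""
    return x1 / math.hypot(x1, x2)


random.seed(2014)  # hypothetical seed, for repeatability only
x1 = [random.uniform(1.0, 11.0) for _ in range(100)]  # assumed range [m]
x2 = [random.uniform(1.0, 11.0) for _ in range(100)]  # assumed range [m]
y = [math.hypot(a, b) + random.gauss(0.0, 0.3) for a, b in zip(x1, x2)]

b0, b1, b2 = fit_linear(x1, x2, y)
print(f"linear fit: b0={b0:.2f}, b1={b1:.2f}, b2={b2:.2f}")

# The fitted b1 is one number; the true effect of +1 m in X1 depends on X2.
print(f"true slope when X1 << X2: {true_slope_x1(1.0, 10.0):.3f}")
print(f"true slope when X2 << X1: {true_slope_x1(10.0, 1.0):.3f}")
```

The linear form can fit such data well and still misstate what happens to Y when X1 is changed: the fitted slope is a single constant, while the true slope runs from near 0 to near 1 depending on where the object sits.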