7. What to Optimize?

CH1. What is what
CH2. A simple SPF
CH3. EDA
CH4. Curve fitting
CH5. A first SPF
CH6. Which fit is fitter
CH7. Choosing the objective function
CH8. Theoretical stuff
CH9. Adding variables
CH10. Choosing a model equation

In this session:
1. Can one do better by optimizing something else?
2. Likelihood, not LS?
3. Using a handful of likelihood functions.

SPF workshop February 2014, UBCO

Perhaps the fit was bad because:
• The function $\hat{E}\{\mu\} = \beta_0 X^{\beta_1}$ is not good (later)
• Important traits are missing
• The objective function is not appropriate (now)

The two common methods:
• Least Squares (Carl Friedrich Gauss, 1777-1855)
• Maximum Likelihood (Sir Ronald Fisher, 1890-1962)

Introducing ‘Likelihood’
"In statistics, maximum-likelihood estimation (MLE) is a method of estimating the parameters of a statistical model. When applied to a data set and given a statistical model, maximum-likelihood estimation provides estimates for the model's parameters." (From Wikipedia)
Popular in SPF modeling.

Example 1: Get the ML estimate of μ

Year   Crashes
1      1
2      7
3      4
4      0

What is the probability of 1, 7, 4, 0 if μ = 2.0 crashes/year?
Open #9. ‘Likelihood functions’ on the ‘Poisson’ workpage.

The ‘Likelihood Function’
[Spreadsheet screenshot. Callouts: the likelihood of μ = 2.0; the likelihood of μ = 4.0]
ℒ(.) will be used to denote a likelihood function. The dot in the parentheses is a placeholder for parameters. Thus, e.g., ℒ(μ) is the likelihood function of μ.

Computing the likelihood at very many μ’s, we would see a smooth curve: the ‘likelihood function’.
[Figure: likelihood (probability to observe 1, 7, 4 and 0 accidents; vertical axis 0.E+00 to 3.E-05) against the mean μ (horizontal axis 0 to 9). Callout: the μ at which observing 1 & 7 & 4 & 0 is most probable]

The parameter value at which the likelihood function has its peak is the ‘Maximum Likelihood’ (ML) estimate of that parameter. It is not the most probable value of the parameter.
It is the parameter value at which the observations are most probable.

With the 1, 7, 4, 0 crash record, which μ is most likely?
Return to #9 on the ‘Poisson ML’ workpage.
Use ‘Solver’ (callouts: the ‘Target’ cell; the ‘By Changing’ cell).
Show that the ML estimate of μ is 3.00.

Example 2: What distribution fits the data?
Continue now to the ‘Does the Poisson fit’ workpage.
The Data: n(k) is the number of drivers, out of 29531, who had k accidents during 1931-1936 (the callout marks the number of drivers who had 1 accident).
If all drivers had the same μ then one would expect n(k) to be consistent with the Poisson distribution. Is it?

[Figure: vertical axis 0.000 to 1.500, horizontal axis k = 0 to 7. Callouts: ‘If Poisson was a good fit’; ‘Here the Poisson predicts too few crashes’]
Answer: ....

What distribution does fit the data?
Continue to the ‘NegBin ML Empty’ workpage in #9.
• Poisson applies to a population of units that all have the same μ.
• NB applies to populations of units where each unit may have a different μ and the μ’s are Gamma distributed.
(Portrait captions: 1781-1840; 1817-1951)

Poisson: $P(K = k) = \frac{\mu^k e^{-\mu}}{k!}$
NB: parameters to be estimated.
Will NB fit the data?

Question 1: What are the ML estimates of ‘a’ and ‘b’? (Both must be positive.)
Question 2: With these ‘a’ and ‘b’, how good is the correspondence between the observed and fitted n(k)?

Preparing the likelihood function for Solver
[Spreadsheet screenshot. Callouts: initial guesses; B*C]
The likelihood function is the product of many small probabilities. To avoid computational difficulties we use the log-likelihood: the ‘product’ is replaced by a ‘sum’.
The probability that n(k) units have k accidents is $P(K=k)^{n(k)}$.
The log-likelihood is the sum over all k of $n(k)\ln[P(K=k)]$.
Now we are ready to estimate ‘a’ and ‘b’.

[Solver screenshot. Callouts: ‘a’ and ‘b’ must be non-negative; ML estimates]
Does this NB fit the data?

Was the NB assumption for populations reasonable?
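Returning to Example 1: the likelihood of a candidate μ is the product of the Poisson probabilities of the four yearly counts. The following is a minimal Python sketch, not the workshop's spreadsheet; the function names and the grid scan (a crude stand-in for Solver) are assumptions made for illustration.

```python
import math

crashes = [1, 7, 4, 0]  # yearly crash counts from Example 1


def poisson_pmf(k, mu):
    """P(K = k) for a Poisson distribution with mean mu."""
    return mu ** k * math.exp(-mu) / math.factorial(k)


def likelihood(mu, counts):
    """Product of the Poisson probabilities of all observed counts."""
    value = 1.0
    for k in counts:
        value *= poisson_pmf(k, mu)
    return value


def log_likelihood(mu, counts):
    """Sum of log-probabilities; it peaks at the same mu as the likelihood."""
    return sum(math.log(poisson_pmf(k, mu)) for k in counts)


# Scan mu from 0.01 to 9.00 in steps of 0.01 and keep the best value.
grid = [i / 100 for i in range(1, 901)]
mu_ml = max(grid, key=lambda mu: log_likelihood(mu, crashes))

print(f"L(2.0) = {likelihood(2.0, crashes):.3e}")
print(f"L(4.0) = {likelihood(4.0, crashes):.3e}")
print(f"ML estimate of mu = {mu_ml:.2f}")
```

The likelihoods at μ = 2.0 and μ = 4.0 are both lower than at μ = 3.00, which is also the sample mean of the four counts.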
(By method of moments in 1.4 we got 3.55 and 0.85.)
[Spreadsheet screenshot. Callout: numbers expected if NB and parameters both were true]

Now the ground is ready: the Poisson likelihood function for SPF curve-fitting

$P(K = k) = \frac{\mu^k e^{-\mu}}{k!}$ (the $k!$ does not matter)

Log-likelihood $= \sum_{i=1}^{n} \left[-\mu_i + k_i \ln \mu_i\right]$ (Chapter 8)

Replace $\mu_i$ by $\beta_0 (\text{Segment Length})^{\beta_1}$ and you have a function of $\beta_0$ and $\beta_1$. Now you can find the values of $\beta_0$ and $\beta_1$ which make the log-likelihood largest.

The C-F spreadsheet for the Poisson likelihood function
Go to: #10.Poisson fit (Full).xlsx on the ‘Poisson’ workpage.
Log-likelihood $= \sum_{i=1}^{n} \left[-\mu_i + k_i \ln \mu_i\right]$
[Spreadsheet screenshot. Callouts: sum of log-likelihoods; our model equation (for now)]
Formula: =-E8+D8*LN(E8), copy down.

Click SOLVER.
[Spreadsheet screenshot. Callout: solution]
Very similar to OLS. No point in CURE.

The Negative Binomial Likelihood Function
The Poisson L-F solves the ‘equal variances’ problem. However, it has a problem of its own: no overdispersion.
To illustrate, 91 of 5323 segments are 0.01 miles long. For these, the sample variance of accident counts = 0.114 and the sample mean of accident counts = 0.098.
If Poisson, Variance = Mean.
If Variance > Mean: overdispersion, not Poisson.

[Figure: (sample variance)/(sample mean) against segment length [miles], 50 segment length bins. Callout: ‘If Poisson’]

The NB Likelihood Function continued
Assumptions: Common and Different
• Common to Poisson and Negative Binomial: crash counts for each unit are Poisson distributed.
• Poisson: units with the same traits in the model equation have the same μ.
• Negative Binomial: units with the same traits in the model equation have μ’s that come from a Gamma distribution.

• The Gamma pdf can take on a variety of shapes.
• Limitations: $E\{\mu\} = b/a$, $VAR\{\mu\} = (E\{\mu\})^2/b$.
For many populations NB fits. Ergo: Gamma is often OK.

Implementing the NB on a C-F spreadsheet.
Go to #11 NB fit.xlsx.
[Spreadsheet screenshot. Callouts: sum of log-likelihoods; our model equation (for now); log-likelihood for segment 1]

Modifying the Poisson C-F spreadsheet to NB

$$\ln \mathcal{L}^*(\beta_0, \beta_1, \ldots, b) = \sum_{i=1}^{n} \left[\ln\Gamma(k_i + bL_i) - \ln\Gamma(bL_i) + bL_i \ln(bL_i) + k_i \ln E\{\mu_i\} - (bL_i + k_i)\ln(bL_i + E\{\mu_i\})\right]$$

Details in text.
Formula: =IF(OR(B8<=0,C8<=0,E8<=0),0,GAMMALN(D8+$G$2*B8)-GAMMALN($G$2*B8)+$G$2*B8*LN($G$2*B8)+D8*LN(E8)-($G$2*B8+D8)*LN($G$2*B8+E8))
Add a cell for the new parameter.

SOLVER solution: very similar to OLS and Poisson.

Example of use: What are the estimates of E{μ} and VAR{μ} for a 0.7 mile long segment (Colorado, two-lane,...)?
Answer:
• Estimate of $E\{\mu\} = 1.636 \times 0.7^{0.871} = 1.20$ I&F crashes in 5 years;
• $VAR\{\mu_i\} = (E\{\mu_i\})^2/b_i$ and $b_i = b \times L_i$. Estimate of $VAR\{\mu\} = 1.20^2/(0.531 \times 0.7) = 3.87$ (I&F...)².
The standard deviation is $\sqrt{3.87} = 1.97$, so the estimate is 1.20 ± 1.97: must reduce uncertainty!

In this session we asked: What to Optimize?
Traditionally:
1. Minimize (weighted) SSD
2. Maximize Likelihood
Both are motivated by a focus on parameters.
When the focus is on ‘How to predict well’, other criteria emerge:
1. Minimize absolute deviations
2. Minimize Total Absolute Bias
3. Minimize χ², etc.
In lecture notes.

Summary for Chapter 7.
1. Instead of minimizing SSD (which gave poor fits), we asked whether the fit is improved by maximizing likelihood;
2. Likelihood was explained and illustrated;
3. To write a likelihood function one must make assumptions. The assumptions behind the Poisson and NB likelihoods were discussed;
4. We used the Poisson likelihood function. The fit was not improved;
5. One of the assumptions behind the Poisson likelihood is not realistic. The NB likelihood function removes the blemish;
6. The estimate of the shape parameter ‘b’ is needed for tasks such as blackspot identification and EB safety estimation;
7. The fit is still not very good. Can it be improved by using a better model equation?
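The Poisson curve-fitting step (maximize the sum of $-\mu_i + k_i \ln \mu_i$ with $\mu_i = \beta_0 L_i^{\beta_1}$) can also be sketched outside the spreadsheet. The Python sketch below is an illustration under stated assumptions: the segment lengths and crash counts are made up and are not the workshop's data, and instead of Solver it uses the fact that, for a fixed $\beta_1$, the best $\beta_0$ is $\sum k_i / \sum L_i^{\beta_1}$, so only $\beta_1$ has to be searched.

```python
import math

# Hypothetical data (NOT the workshop's): segment lengths in miles, crash counts.
lengths = [0.1, 0.2, 0.5, 0.8, 1.0, 1.5, 2.0, 3.0]
crashes = [0, 1, 1, 2, 1, 3, 2, 5]


def poisson_loglik(beta0, beta1):
    """Sum over segments of -mu_i + k_i*ln(mu_i); the ln(k_i!) term is dropped."""
    total = 0.0
    for seg_len, k in zip(lengths, crashes):
        mu = beta0 * seg_len ** beta1
        total += -mu + k * math.log(mu)
    return total


def best_beta0(beta1):
    """For a fixed beta1, setting the derivative in beta0 to zero gives this."""
    return sum(crashes) / sum(seg_len ** beta1 for seg_len in lengths)


def profile_loglik(beta1):
    """Log-likelihood with beta0 already set to its best value for this beta1."""
    return poisson_loglik(best_beta0(beta1), beta1)


def fit(lo=0.0, hi=3.0, tol=1e-6):
    """Ternary search over beta1 (assumes the maximum lies between lo and hi)."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if profile_loglik(m1) < profile_loglik(m2):
            lo = m1
        else:
            hi = m2
    beta1 = (lo + hi) / 2
    return best_beta0(beta1), beta1


beta0_hat, beta1_hat = fit()
print(f"beta0 = {beta0_hat:.3f}, beta1 = {beta1_hat:.3f}")
print(f"log-likelihood = {poisson_loglik(beta0_hat, beta1_hat):.4f}")
```

The per-segment term inside `poisson_loglik` is the same expression as the spreadsheet formula `=-E8+D8*LN(E8)`; the search bounds for `beta1` are an arbitrary choice for this made-up data set.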
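The NB log-likelihood term and the 0.7-mile ‘Example of use’ can be checked in a few lines. This is a sketch, not the workshop's spreadsheet: the function names are illustrative, `math.lgamma` stands in for Excel's `GAMMALN`, and the values 1.636, 0.871 and 0.531 are the estimates quoted in the example.

```python
import math


def nb_loglik_term(k, length, mu, b):
    """Log-likelihood contribution of one segment, with b_i = b * length.

    Follows the NB log-likelihood used on the spreadsheet; the ln(k!) term,
    which does not depend on the parameters, is omitted.
    """
    if length <= 0 or mu <= 0 or b <= 0:
        return 0.0  # guard in the spirit of the spreadsheet's IF(OR(...), 0, ...)
    bl = b * length
    return (math.lgamma(k + bl) - math.lgamma(bl)
            + bl * math.log(bl) + k * math.log(mu)
            - (bl + k) * math.log(bl + mu))


def predict(length, beta0, beta1, b):
    """Return (E{mu}, VAR{mu}, standard deviation) for one segment."""
    mean = beta0 * length ** beta1
    variance = mean ** 2 / (b * length)
    return mean, variance, math.sqrt(variance)


mean, variance, sd = predict(0.7, beta0=1.636, beta1=0.871, b=0.531)
print(f"E{{mu}} = {mean:.2f}, VAR{{mu}} = {variance:.2f}, sd = {sd:.2f}")

# Log-likelihood term of a hypothetical 0.7-mile segment with 2 crashes.
print(f"term = {nb_loglik_term(2, 0.7, mean, 0.531):.4f}")
```

After rounding to two decimals, `predict` reproduces the 1.20, 3.87 and ±1.97 of the worked example. Adding back the omitted `-ln(k!)` term turns `nb_loglik_term` into the log of a proper NB probability, which is a convenient sanity check when porting the formula.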