7. What to Optimize?
CH1. What is what
CH2. A simple SPF
CH3. EDA
CH4. Curve fitting
CH5. A first SPF
CH6. Which fit is fitter
CH7. Choosing the objective function
CH8. Theoretical stuff
CH9. Adding variables
CH10. Choosing a model equation
In this session:
1. Can one do better by optimizing something else?
2. Likelihood, not LS?
3. Using a handful of likelihood functions.
SPF workshop February 2014, UBCO
Perhaps the fit was bad because:
• The function $\hat{E}\{\mu\} = \beta_0 X^{\beta_1}$ is not good (later)
• Important traits are missing (later)
• The objective function is not appropriate (now)
The two common methods:
• Least Squares (Carl Friedrich Gauss, 1777-1855)
• Maximum Likelihood (Sir Ronald Fisher, 1890-1962)
Introducing ‘Likelihood’
In statistics, maximum-likelihood estimation (MLE) is a
method of estimating the parameters of a statistical
model. When applied to a data set and given a
statistical model, maximum-likelihood estimation
provides estimates for the model's parameters.
From Wikipedia
Popular in SPF modeling
Example 1: Get the ML estimate of a μ
Year    Crashes
1       1
2       7
3       4
4       0
What is the probability of 1, 7, 4, 0 if μ=2.0 crashes/year?
Open #9. ‘Likelihood functions’ on ‘Poisson’ workpage
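The question above can also be answered outside the spreadsheet. A minimal sketch, assuming the four yearly counts are independent Poisson counts with a common μ (the function names are illustrative, not from the workbook):

```python
import math

def poisson_pmf(k, mu):
    """P(K = k) for a Poisson distribution with mean mu."""
    return mu ** k * math.exp(-mu) / math.factorial(k)

def likelihood(mu, counts):
    """Probability of observing all the counts if every year has mean mu."""
    prob = 1.0
    for k in counts:
        prob *= poisson_pmf(k, mu)
    return prob

crashes = [1, 7, 4, 0]  # the Example 1 crash record
for mu in (2.0, 4.0):
    print(f"mu = {mu}: probability of the record = {likelihood(mu, crashes):.3e}")
```

The two printed values are the likelihoods of μ=2.0 and μ=4.0 for this crash record.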
The ‘Likelihood Function’
The likelihood of μ=4.0
The likelihood of μ=2.0
ℒ(.) will be used to denote a likelihood function. The
dot in the parenthesis is a placeholder for parameters.
Thus, e.g., ℒ(μ) is the likelihood function of μ.
Computing likelihood at very many μ’s we would
see a smooth curve - the ‘likelihood function’.
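A rough sketch of "computing likelihood at very many μ's", assuming independent Poisson counts and an arbitrary grid of μ from 0.01 to 9.00 in steps of 0.01 (the grid is an illustration choice, not part of the workbook):

```python
import math

def likelihood(mu, counts):
    """Product of Poisson probabilities of the counts at mean mu."""
    value = 1.0
    for k in counts:
        value *= mu ** k * math.exp(-mu) / math.factorial(k)
    return value

crashes = [1, 7, 4, 0]                      # the Example 1 crash record
grid = [i / 100 for i in range(1, 901)]     # mu = 0.01, 0.02, ..., 9.00
curve = [likelihood(mu, crashes) for mu in grid]

peak = max(curve)                           # highest point of the curve
mu_at_peak = grid[curve.index(peak)]        # the mu where it occurs
print(mu_at_peak, peak)
```

Plotting `curve` against `grid` would trace the likelihood function; `mu_at_peak` is the grid value where it is highest.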
[Figure: the likelihood, i.e. the probability of observing 1, 7, 4 and 0 accidents (vertical axis, 0 to 3.E-05), plotted against the mean μ (horizontal axis, 0 to 9). The peak of the curve marks the μ at which observing 1 & 7 & 4 & 0 is most probable.]
The parameter value at which the likelihood
function has its peak is the ‘Maximum Likelihood’
(ML) estimate of that parameter.
It is not the most probable value of the parameter.
It is the parameter value at which the observations
are most probable.
With the 1, 7, 4, 0 crash record, which μ is most likely?
Return to #9 on the ‘Poisson ML’ workpage
The ‘Target’ cell
Use ‘Solver’
The ‘By Changing’ cell
Show that the ML estimate of μ is 3.00.
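The Solver result can be cross-checked by hand. A short derivation, assuming the four yearly counts $k_i$ are independent Poisson counts with a common μ:

```latex
\begin{aligned}
\ln \mathcal{L}(\mu) &= \sum_{i=1}^{4} \left[ -\mu + k_i \ln \mu - \ln k_i! \right] \\
\frac{d \ln \mathcal{L}(\mu)}{d\mu} &= -4 + \frac{1}{\mu} \sum_{i=1}^{4} k_i = 0 \\
\hat{\mu} &= \frac{1 + 7 + 4 + 0}{4} = 3.00
\end{aligned}
```

Under this assumption the ML estimate is simply the sample mean of the crash record.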
Example 2: What distribution fits the data?
Continue now to the ‘Does the Poisson fit’ workpage.
Number of drivers out of 29531 who,
during 1931-1936 had 1 accident.
If all drivers had the same μ
then one would expect n(k) to
be consistent with the Poisson
distribution.
Is it?
The Data
[Figure: chart with a horizontal axis from 0 to 7 and a vertical axis from 0.000 to 1.500. Annotations: "If Poisson was a good fit" and "Here the Poisson predicts too few crashes".]
Answer: ....
What distribution does fit the data?
Continue to the ‘NegBin ML Empty’ workpage in #9.
• Poisson applies to a population of units that all have the same μ
• NB applies to populations of units where each unit may have a different μ and the μ's are Gamma distributed
1781-1840
1817-1951
$P(K = k) = \mu^k e^{-\mu} / k!$
Parameters to be estimated
Will NB fit the data?
Question 1: What are the
ML estimates of ‘a’ and ‘b’?
(both must be positive)
Question 2: With these ‘a’ and ‘b’, how good is the
correspondence between the observed and fitted n(k)?
Preparing the likelihood function for Solver
Initial guesses
The likelihood function is the product of many small
probabilities. To avoid computational difficulties we
use the log-likelihood: the ‘product’ is replaced by a ‘sum’.
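The computational difficulty can be seen directly. A small sketch, assuming 5000 hypothetical observations that each have the same probability (the Poisson probability of one crash when μ=2.0); both numbers are made up for illustration:

```python
import math

p = 2.0 * math.exp(-2.0)    # Poisson P(K = 1) when mu = 2.0, about 0.27
n_units = 5000              # hypothetical number of observations

product = 1.0
log_sum = 0.0
for _ in range(n_units):
    product *= p            # shrinks toward zero and eventually underflows
    log_sum += math.log(p)  # stays an ordinary negative number

print(product)              # the product is no longer representable
print(log_sum)              # the sum of logs still is
```

The product collapses to zero in double-precision arithmetic, so an optimizer has nothing to climb, while the sum of logarithms remains usable.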
B*C
The probability that n(k) units have k accidents is P(K=k)^n(k)
The log-likelihood is the sum over all k of n(k)ln[P(K=k)]
Now we are ready to estimate ‘a’ and ‘b’
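The grouped log-likelihood can be sketched as below. The tally is hypothetical (it is not the drivers data), and a Poisson pmf is plugged in only to keep the example short; an NB log-pmf in ‘a’ and ‘b’ could be passed in the same way.

```python
import math

def poisson_log_pmf(k, mu):
    """ln P(K = k) for a Poisson distribution with mean mu."""
    return k * math.log(mu) - mu - math.lgamma(k + 1)

def grouped_log_likelihood(n_of_k, log_pmf):
    """Sum over all k of n(k) * ln P(K = k); n_of_k maps k to the number of units."""
    return sum(n * log_pmf(k) for k, n in n_of_k.items())

# Hypothetical tally for illustration: 3 units with 0 accidents, 2 with 1, 1 with 2
n_of_k = {0: 3, 1: 2, 2: 1}
per_unit = [0, 0, 0, 1, 1, 2]   # the same units listed one by one

mu = 0.7                        # arbitrary trial value of the parameter
grouped = grouped_log_likelihood(n_of_k, lambda k: poisson_log_pmf(k, mu))
ungrouped = sum(poisson_log_pmf(k, mu) for k in per_unit)
print(grouped, ungrouped)       # the two sums agree
```

Grouping by k changes nothing in the value; it only shortens the spreadsheet to one row per k.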
‘a’ and ‘b’ must be non-negative
ML estimates
Does this NB fit the data?
Was the NB assumption for populations reasonable?
(By method of moments in
1.4 we got 3.55 and 0.85)
Numbers
expected if NB
and parameters
both were true.
Now the ground is ready:
The Poisson Likelihood function for SPF curve-fitting
$P(K = k) = \mu^k e^{-\mu} / k!$ (the $k!$ term does not matter when maximizing)
Log-likelihood $= \sum_{i=1}^{n} \left[ -\mu_i + k_i \ln \mu_i \right]$ (Chapter 8)
Replace $\mu_i$ by $\beta_0 (\text{Segment Length}_i)^{\beta_1}$
and you have a function of $\beta_0$ and $\beta_1$.
Now you can find values of $\beta_0$ and $\beta_1$
which make the log-likelihood largest.
The C-F spreadsheet for Poisson likelihood function.
Go to: #10.Poisson fit (Full).xlsx on ‘Poisson’ workpage
Log-likelihood $= \sum_{i=1}^{n} \left[ -\mu_i + k_i \ln \mu_i \right]$
Sum of log-likelihoods
Our model equation (for now)
Formula: =-E8+D8*LN(E8), copy down
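The same computation outside the spreadsheet might look like the sketch below. It assumes, from the cell formula, that column D holds the crash count and column E the fitted μ; the segment data are made up for illustration. The helper for β0 uses the fact that, with β1 held fixed, setting the derivative of the log-likelihood to zero gives β0 = Σk / ΣL^β1.

```python
import math

def poisson_spf_log_likelihood(beta0, beta1, lengths, counts):
    """Sum over segments of -mu_i + k_i*ln(mu_i), with mu_i = beta0 * L_i**beta1."""
    total = 0.0
    for length, k in zip(lengths, counts):
        mu = beta0 * length ** beta1        # model equation (assumed column E)
        total += -mu + k * math.log(mu)     # same form as =-E8+D8*LN(E8)
    return total

def best_beta0_given_beta1(beta1, lengths, counts):
    """With beta1 fixed, the maximizing beta0 is sum(k) / sum(L**beta1)."""
    return sum(counts) / sum(length ** beta1 for length in lengths)

# Made-up segments, for illustration only
lengths = [0.2, 0.5, 1.0, 2.0]   # miles
counts = [0, 1, 1, 3]            # crashes

beta1 = 1.0
beta0 = best_beta0_given_beta1(beta1, lengths, counts)
print(beta0, poisson_spf_log_likelihood(beta0, beta1, lengths, counts))
```

Solver does the two-parameter search numerically; the closed form for β0 is only a convenient check on a single parameter.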
Click
SOLVER solution
Very similar to OLS,
No point in CURE
The Negative Binomial Likelihood Function
The Poisson L-F solves the ‘equal variances’ problem.
However, it has a problem of its own – no overdispersion.
To illustrate, 91 of 5323 segments are 0.01 miles long.
For these,
Sample Variance of accident counts =0.114,
Sample Mean of accident counts =0.098.
If Poisson, Variance=Mean
If Variance>Mean, Overdispersion, not Poisson
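The check is a one-line ratio. A small sketch using the two sample statistics quoted above; the helper that works from raw counts is an illustrative addition, not part of the workbook:

```python
import statistics

sample_variance = 0.114   # accident counts on the 91 segments of 0.01 miles
sample_mean = 0.098

ratio = sample_variance / sample_mean
print(round(ratio, 2))    # 1.16; a ratio above 1 indicates overdispersion

def dispersion_ratio(counts):
    """(Sample variance)/(sample mean) of a list of accident counts."""
    return statistics.variance(counts) / statistics.mean(counts)
```

If the counts were Poisson the ratio would be close to 1, apart from sampling noise.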
[Figure: (sample variance)/(sample mean) plotted against segment length [miles] for 50 segment length bins, with a reference marked "If Poisson".]
The NB Likelihood Function continued
Assumptions: Common and Different
Poisson
• Crash counts for each unit are Poisson distributed
• Units with the same traits in the model equation have the same μ
Negative Binomial
• Crash counts for each unit are Poisson distributed
• Units with the same traits in the model equation have μ's that come from a Gamma distribution
• The Gamma pdf can take on a variety of shapes.
• Limitations.
$E\{\mu\} = b/a$,
$VAR\{\mu\} = (E\{\mu\})^2 / b$
For many populations NB fits.
Ergo: Gamma is often OK.
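What the Gamma assumption implies for the counts can be checked numerically. A sketch, assuming μ is Gamma distributed with E{μ}=b/a and VAR{μ}=(E{μ})²/b and counts are Poisson given μ, which yields the standard negative binomial pmf; the values of ‘a’ and ‘b’ are illustrative, not estimates from the workshop data:

```python
import math

def nb_pmf(k, a, b):
    """P(K = k) for a Poisson count whose mean is Gamma with E = b/a, VAR = (b/a)**2 / b."""
    log_p = (math.lgamma(k + b) - math.lgamma(b) - math.lgamma(k + 1)
             + b * math.log(a / (1.0 + a)) - k * math.log(1.0 + a))
    return math.exp(log_p)

a, b = 2.0, 1.5                  # illustrative values only
ks = range(0, 400)               # the tail beyond 400 is negligible for these values
probs = [nb_pmf(k, a, b) for k in ks]

total = sum(probs)
mean = sum(k * p for k, p in zip(ks, probs))
variance = sum((k - mean) ** 2 * p for k, p in zip(ks, probs))
print(total, mean, variance)     # variance exceeds mean by mean**2 / b
```

The count variance equals the mean plus (mean)²/b, so a smaller ‘b’ means more overdispersion.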
Implementing the NB on a C-F spreadsheet.
Go to #11 NB fit.xlsx
Sum of log-likelihoods
Our model equation (for now)
Log-likelihood for segment 1
Modifying the Poisson C-F spreadsheet to NB
$\ln \mathcal{L}^*(\beta_0, \beta_1, \ldots, 𝒷) = \sum_{i=1}^{n} \left[ \ln\Gamma(k_i + 𝒷L_i) - \ln\Gamma(𝒷L_i) + 𝒷L_i \ln(𝒷L_i) + k_i \ln E\{\mu_i\} - (𝒷L_i + k_i) \ln(𝒷L_i + E\{\mu_i\}) \right]$
Details in text
=IF(OR(B8<=0,C8<=0,E8<=0),0,GAMMALN(D8+$G$2*B8)-GAMMALN($G$2*B8)+$G$2*B8*LN($G$2*B8)+D8*LN(E8)-($G$2*B8+D8)*LN($G$2*B8+E8))
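A sketch of one segment's term in Python, assuming (from the cell formula) that column B is the segment length, D the crash count, E the fitted E{μ} and G2 the new parameter 𝒷. The guard below is in the spirit of the IF(OR(...)) test but does not reproduce it cell for cell. The parameter values come from the worked example in this chapter; the crash count of 2 is hypothetical.

```python
import math

def nb_log_likelihood_term(k, length, e_mu, b_per_mile):
    """One segment's contribution to ln L* (the ln k! term is left out)."""
    if length <= 0 or e_mu <= 0 or b_per_mile <= 0:
        return 0.0                      # skip rows with unusable inputs
    bl = b_per_mile * length            # shape parameter for this segment
    return (math.lgamma(k + bl) - math.lgamma(bl) + bl * math.log(bl)
            + k * math.log(e_mu) - (bl + k) * math.log(bl + e_mu))

length = 0.7                            # miles
e_mu = 1.636 * length ** 0.871          # model equation with the fitted parameters
b_per_mile = 0.531
print(nb_log_likelihood_term(2, length, e_mu, b_per_mile))  # hypothetical count k = 2
```

Subtracting ln k! from a term gives the log of the full NB probability, so the exponentiated values summed over all k come to 1; that is a convenient sanity check on the formula.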
Add cell for
new parameter
SOLVER solution
Very similar to OLS,
and Poisson
Example of use:
What are the estimates of E{μ} and VAR{μ} for a
0.7 mile long segment (Colorado, two-lane,...)?
Answer:
• Estimate of $E\{\mu\} = 1.636 \times 0.7^{0.871} = 1.20$ I&F crashes in 5 years;
• $VAR\{\mu_i\} = (E\{\mu_i\})^2 / b_i$ and $b_i = 𝒷 \times L_i$
Estimate of $VAR\{\mu\} = 1.20^2 / (0.531 \times 0.7) = 3.87$ (I&F...)^2
1.20±1.97, must reduce uncertainty!
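The arithmetic of this example can be reproduced in a few lines. A sketch using the parameter values quoted in the answer (variable names are illustrative):

```python
import math

beta0, beta1 = 1.636, 0.871   # fitted model-equation parameters from the example
b_per_mile = 0.531            # fitted NB parameter per mile
length = 0.7                  # segment length in miles

e_mu = beta0 * length ** beta1      # estimate of E{mu}
b_i = b_per_mile * length           # shape parameter for this segment
var_mu = e_mu ** 2 / b_i            # estimate of VAR{mu}
sd_mu = math.sqrt(var_mu)           # standard deviation of mu

print(f"{e_mu:.2f} {var_mu:.2f} {sd_mu:.2f}")  # 1.20 3.87 1.97
```

The standard deviation is larger than the estimate itself, which is the point of "must reduce uncertainty".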
In this session we asked: What to Optimize?
Traditionally:
1. Minimize (weighted) SSD
2. Maximize Likelihood
Both are motivated by focus on parameters
When the focus is on ‘How to predict well’,
other criteria emerge:
1. Minimize absolute deviations
2. Minimize Total Absolute Bias
3. Minimize $\chi^2$, etc.
In lecture notes.
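For orientation, some of these objective functions can be sketched as below. The definitions are common textbook forms (sum of squared differences, sum of absolute deviations, Pearson's χ²) and are assumptions here; the lecture notes may define the criteria differently. The observed and fitted values are made up.

```python
def sum_squared_differences(observed, fitted):
    """SSD: sum of (k_i - mu_i)**2."""
    return sum((k - mu) ** 2 for k, mu in zip(observed, fitted))

def sum_absolute_deviations(observed, fitted):
    """Sum of |k_i - mu_i|."""
    return sum(abs(k - mu) for k, mu in zip(observed, fitted))

def pearson_chi_square(observed, fitted):
    """Assumed form of chi-square: sum of (k_i - mu_i)**2 / mu_i."""
    return sum((k - mu) ** 2 / mu for k, mu in zip(observed, fitted))

# Made-up observed counts and fitted values, for illustration only
observed = [0, 1, 1, 3]
fitted = [0.5, 1.0, 1.5, 2.0]
print(sum_squared_differences(observed, fitted),
      sum_absolute_deviations(observed, fitted),
      pearson_chi_square(observed, fitted))
```

Any of these could be placed in the Solver target cell in place of the log-likelihood, with "minimize" selected instead of "maximize".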
Summary for Chapter 7.
1. Instead of minimizing SSD (which gave poor fits), we
asked whether fit is improved by maximizing likelihood;
2. Likelihood was explained and illustrated;
3. To write a likelihood function one must make
assumptions. The assumptions behind the Poisson and
NB likelihoods were discussed;
4. We used the Poisson likelihood function. The fit was
not improved.
5. One of the assumptions behind the Poisson likelihood
is not realistic. The NB likelihood function removes the
blemish.
6. The estimate of the shape parameter ‘b’ is
needed for tasks such as blackspot
identification and EB safety estimation;
7. The fit is still not very good. Can it be improved
by using a better model equation?