Workshop Objectives:
a. Learn how to fit an SPF to data
b. Understand what SPFs can and cannot do
CH1. What is what
CH2. A simple SPF
CH3. EDA
CH4. Curve fitting
CH5. A first SPF
CH6. Which fit is fitter
CH7. Choosing the objective function
CH8. Theoretical stuff (skip)
CH9. Adding variables
CH10. Choosing a model equation
SPF workshop UBCO February 2014
What is what.
1. What are SPFs?
2. What information do (should) they give us?
3. What is that information used for?
Loosely speaking, SPFs are tools that give
information about the safety of units such as road
segments, intersections, ramps, grade crossings …
What
is this?
What is Safety?
Here is a count of injury
accidents for a Freeway
Segment in Colorado.
What is its SAFETY?
Here is a (monthly) count of
accidents for an Intersection
in Toronto.
What is its SAFETY?
Segment of urban freeway in Denver
Intersection in Toronto
… “what is its safety?” implies that
SAFETY is a property of UNITS
What is a ‘Unit’?
A Unit can be a road segment, an intersection,
Mr. C.J. Smith, heavy trucks on the 401, etc.
What is the Safety of a unit?
Had I defined Safety = Accident Counts, that would mean that
safety improved from 1986 to 1987, deteriorated from 1987 to
1988, etc.
Such a definition is not useful for safety management because
‘safety’ would change even if there is no change in
safety-relevant traits (exposure, traffic control, physical
features, user demography, etc.).
1.9 mile long segment of 6-lane
urban freeway in Denver, Colorado
We need a definition of the safety of a unit such
that, as long as the ‘safety-relevant’ traits of the
unit do not change, its ‘safety’ does not change.
Three period running averages;
Freeway Segment, Colorado
Thirteen period running averages,
Intersection, Toronto
One can rightly imagine that behind the fluctuations there is a
gradually changing safety property that is some kind of average
Thirteen period running averages,
Intersection, Toronto
There are three elements in the graph:
1. Observed values (●)
2. The invisible (unknown) safety property (μ)
3. Our estimate of the unknown property (○)
What is the ‘safety of a unit’?
We are now ready.
Definition: The safety property of a unit is the
number of accidents by type and severity, expected
to occur on it in a specified period of time. It will
always be denoted by μ and its estimate by μ̂.
Accident type: Rear-end, Angle, Single-vehicle, Pedestrian
Accident severity: PDO, Injury, Fatal
Values: 3.10, 1.40, 0.30, 1.70, 0.90, 0.10, 0.20, 0.10, 0.02, 0.05, 0.03
The ‘safety’ of a unit depends on its ‘traits’
We are gradually assembling the elements
needed to say with clarity what an SPF is.
Eventually it will be a function of ‘variables’.
What is the link between safety and variables?
Traits & Safety
S-R traits
Definition: A trait is ‘safety-related’
if when it changes, μ changes.
Consequence: Units with the same s-r traits have the same μ.
Corollary: Units that differ in some s-r traits differ in their μ’s.
Populations
Units that share some traits form a population of units.
Example: (1) rural, (2) two-lane road segments in (3) flat
terrain of (4) Colorado.
Because only some traits are common, the units differ in many s-r
traits and therefore differ in their μ’s.
We will describe the safety of a
population by:
Mean of μ’s, E{μ} and
Standard deviation of μ’s, σ{μ}
Populations: real and imagined
Example: segments of rural two-lane
roads in Colorado form a population
Their shared traits are:
(1) State: Colorado,
(2) Road Type: two-lane,
(3) Setting: rural.
A new population (subset)
(1) & (2) & (3) & (4) Terrain: flat.
The more traits the fewer units.
Colorado data:
Shared traits (1) State: Colorado, (2) Road Type: two-lane, (3) Setting: rural: 5323 segments
Add: 2.5 < Segment Length < 3.5 miles: 597 segments
Add: 1000 < AADT < 2000 vpd: 119 segments
If the bin is 2400 < AADT < 2420 there are no units, even in this rich data.
But the SPF will still provide an estimate of E{μ}
for a population, albeit an ‘imagined’ one.
Finally: “What is an SPF?”
A Safety Performance Function is a tool which for a
multitude of populations provides estimates of:
1. The mean of the μ’s in populations - E{μ} and
2. The standard deviation of the μ’s in these
populations - σ{μ}.
Notational conventions to remember
μ - the expected number of crashes for a unit.
μ̂ - estimate of μ. A caret above always means:
estimate of ...
E{μ} - average of μ’s in a population of units.
E{.} always means ‘average or expected value of
whatever the dot stands for.’
σ{μ} - standard deviation of μ’s in a population
of units.
σ{.} always means standard deviation of
whatever the dot stands for.
The information we get from an SPF is not about units; it
is always about a population of units.
When we use the SPF information to estimate the
safety of a specific unit we argue as follows: “This unit
has the same traits as the units in the population.
Therefore my best guess of its μ is E{μ}.”
Interim summary
We needed to be clear about what an SPF is.
To get there we had to say what we mean by ‘safety of a
unit’ and that it depends on the unit’s safety-relevant traits.
Further, we had to mention that units that share some
safety-relevant traits form populations of units.
The safety of a population of units
can be described by E{μ} and σ{μ}.
These are necessary for practical applications.
An SPF provides estimates of E{μ} and
σ{μ} for many populations.
What are Ê{μ} and σ̂{μ} needed for?
Two groups of applications:
Group I: We really need the E{μ}.
Examples: (a) To judge what is deviant we have to
know what is ‘normal’. (b) How different are the
E{μ}’s of segments with and without (say) paved
shoulders?
Group II: We really need the μ of a specific unit and
E{μ} helps us to estimate it. Examples: (a) Is this road
segment a ‘blackspot’? (b) How did the μ of this unit
change from ‘before’ treatment to ‘after’ treatment?
Group I: We need the E{μ} of a population
• What is normal for a unit?
• How different are the means of two populations?
To answer: Ê{μ} and σ̂{Ê{μ}}
Group II: We need the μ of a unit
• Is this unit a ‘blackspot’?
• What might be the safety benefit of treating it?
• What was the safety benefit of treating it?
To answer: Ê{μ}, σ̂{Ê{μ}} and σ̂{μ}
Is there a Group III?
Some believe that we want to know the function
linking E{μ} and traits in order to be able to say how a
change in the level of a trait will affect the E{μ} of units.
Opinions differ on whether such a use of an SPF can be
trusted.
I do not think so, and will give my reasons in Session 5.
I hope that by the end of the workshop there will be
more CMF skeptics.
What are Ê{μ} and σ̂{μ} used for?
A sequence of simple illustrations.
1. How many units are deviant? Go to ‘Spreadsheets to
accompany PowerPoints.’
Open Spreadsheet #1
2. How well will my screen work? ‘Connecticut Drivers’ on
‘1. Data’ workbook.
3. What will be the accident savings of a treatment?
4. How effective was the treatment?
Preliminaries:
Get Ê{μ} and σ̂{μ}
Data: Connecticut drivers (1931-1936)

Crashes, k    Drivers, n(k)
0             23881
1             4503
2             936
3             160
4             33
5             14
6             3
7             1
Total         29531
Open workbook 2. ‘Mean and variance estimates’ (of #1)
Computing sample mean and variance.

A      B       C        D       E
k      n(k)    B/B$11   A*C     (A-D$11)^2*C
0      23881   0.8087   0.000   0.047
1      4503    0.1525   0.152   0.088
2      936     0.0317   0.063   0.098
3      160     0.0054   0.016   0.041
4      33      0.0011   0.004   0.016
5      14      0.0005   0.002   0.011
6      3       0.0001   0.001   0.003
7      1       0.0000   0.000   0.002
Sum    29531            0.240   0.306
Stay on workbook 2. ‘Mean and variance estimates’ (of #1)
Sample mean = 0.240, sample variance = 0.306.
Estimate of V{μ}: 0.306 - 0.240 = 0.066 (sample variance minus sample mean),
so σ̂{μ} = √0.066 ≈ 0.26.
Naturally σ{μ}>0.
Even if we used age, gender and exposure as
traits, there still would be differences.
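The workbook computation can be reproduced in a few lines of Python. This is a minimal sketch, not the workshop spreadsheet. It assumes that each driver's crash count is Poisson distributed around that driver's μ, so that V{μ} is estimated by the sample variance minus the sample mean. The variable names are illustrative.

```python
import math

# Connecticut drivers (1931-1936): crashes k -> number of drivers n(k)
n_k = {0: 23881, 1: 4503, 2: 936, 3: 160, 4: 33, 5: 14, 6: 3, 7: 1}

total = sum(n_k.values())

# Sample mean and variance of the crash counts (columns D and E of the workbook)
mean = sum(k * n for k, n in n_k.items()) / total
var = sum((k - mean) ** 2 * n for k, n in n_k.items()) / total

# Assumption: counts are Poisson around each driver's mu,
# so the variance of the mu's is (sample variance - sample mean).
var_mu = var - mean
sigma_mu = math.sqrt(var_mu)

print(f"drivers = {total}")
print(f"sample mean = {mean:.3f}, sample variance = {var:.3f}")
print(f"estimated V(mu) = {var_mu:.3f}, estimated sigma(mu) = {sigma_mu:.2f}")
```

The estimated mean of about 0.24 crashes in six years is the value used in the screening question that follows.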
Use Ê{μ} and σ̂{μ} for: Screening.
Question: What % of these drivers have a μ that is, say, more
than 5 times the mean? (μ > 5*0.24 = 1.2 acc. in six years)
Open workbook 3. ‘How many High mu drivers’ (of #1)
Answer:
1. Assume that the μ’s are Gamma distributed.
2. Compute the parameters a and b of the Gamma distribution.
3. Use the Excel function GAMMADIST(μ, b, 1/a, TRUE).
4. P(μ<1.20) = 0.99
5. There are about 29,531*0.01 ≈ 295 such (5 x) drivers.
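The screening steps can be sketched in Python. The sketch assumes the Gamma parameters a = 3.55 and b = 0.85 that the slides use in later examples, with b as the shape and 1/a as the scale, matching the argument order of GAMMADIST. The helper `gamma_cdf` is a hand-written stand-in for the Excel function, not a library call.

```python
import math

def gamma_cdf(x, shape, rate):
    """P(X <= x) for a Gamma(shape, rate) variable, computed from the power
    series of the regularized lower incomplete gamma function."""
    if x <= 0:
        return 0.0
    z = rate * x
    term = 1.0 / shape          # n = 0 term of the series
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (shape + n)
        total += term
        n += 1
    return total * math.exp(-z + shape * math.log(z) - math.lgamma(shape))

a, b = 3.55, 0.85               # Gamma parameters taken from the slides
drivers = 29531
threshold = 5 * 0.24            # 5 times the mean: 1.2 crashes in six years

p_below = gamma_cdf(threshold, shape=b, rate=a)
n_high = drivers * (1 - p_below)

print(f"P(mu < {threshold:.2f}) = {p_below:.2f}")
print(f"drivers with mu above the threshold: about {n_high:.0f}")
```

The count printed here uses the unrounded probability, so it can differ by a few drivers from the 295 obtained with the rounded value 0.01.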
Use Ê{μ} and σ̂{μ} for: Screen Performance
Question: If we decide to ‘treat’ those 51 (out of 29,531)
who had 4 or more accidents, how will such a screen do?
To answer, we have to determine how many of those
drivers with 4, 5, 6 or 7 crashes are truly ‘high μ’.
Open workbook 4. ‘Gamma with k=4, 5, 6, 7’ (of #1)
If in a population of units μ is Gamma distributed,
then the μ’s of those units with k crashes are also
Gamma distributed, with parameters k+b and a+1 in place of
b and a, so that E{μ|k} = (k+b)/(a+1) (the EB estimate).
Modify formula in B7
and copy down
First answer: Amongst those
who recorded 4 crashes, 66%
have μ<1.2.
Do same for k=5, 6, and 7. Record.
Use Ê{μ} and σ̂{μ} for: Screen Performance

k      n(k)   P(μ≤1.2)   False Positives   Correct Positives
4      33     0.66       22                11
5      14     0.49       7                 7
6      3      0.33       1                 2
7      1      0.20       0                 1
Sums   51                30                21

Answer:
Of the 295 drivers with μ>1.2, 21 are correctly identified
and the remaining 274 are missed; another 30 drivers are
incorrectly identified.
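The screen-performance table can be checked with a short Python sketch. It assumes, as in the slides, Gamma-distributed μ's with a = 3.55 and b = 0.85, so that the μ's of drivers with k crashes are Gamma distributed with shape k+b and rate a+1. The helper `gamma_cdf` is a hand-written stand-in for GAMMADIST, and rounding to whole drivers is an illustrative choice.

```python
import math

def gamma_cdf(x, shape, rate):
    """P(X <= x) for a Gamma(shape, rate) variable, computed from the power
    series of the regularized lower incomplete gamma function."""
    if x <= 0:
        return 0.0
    z = rate * x
    term = 1.0 / shape
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (shape + n)
        total += term
        n += 1
    return total * math.exp(-z + shape * math.log(z) - math.lgamma(shape))

a, b = 3.55, 0.85                      # Gamma parameters taken from the slides
threshold = 1.2                        # 5 times the mean of 0.24
n_k = {4: 33, 5: 14, 6: 3, 7: 1}       # drivers flagged by the screen (k >= 4)

# Share of drivers with k crashes whose mu is NOT above the threshold
p_low = {k: gamma_cdf(threshold, shape=k + b, rate=a + 1) for k in n_k}
false_pos = {k: round(n_k[k] * p_low[k]) for k in n_k}
correct_pos = {k: n_k[k] - false_pos[k] for k in n_k}

for k in n_k:
    print(k, n_k[k], f"{p_low[k]:.2f}", false_pos[k], correct_pos[k])
print("Sums", sum(n_k.values()), sum(false_pos.values()), sum(correct_pos.values()))
```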
Use Ê{μ} and σ̂{μ} for: Anticipating benefit
Preliminaries:
CMF ≡ (Expected accidents ‘with’) / (Expected accidents ‘without’)
Reduction in accidents = μ(1-CMF)
Question: How many accidents will be saved if a
treatment with CMF=0.95 is administered to
Connecticut drivers with k≥4?
Recall that the EB estimate is E{μ|k} = (k+b)/(a+1).
Thus, e.g., for k=4, (4+0.85)/(3.55+1) = 1.07 crashes.
Open workbook 5. ‘Anticipating benefit’ workpage (of #1)

k      n(k)   (k+b)/(a+1)   n(k)*(k+b)/(a+1)
4      33     1.07          35.2
5      14     1.29          18.0
6      3      1.51          4.5
7      1      1.73          1.7
Sum                         59.4

Expected reduction = 59.4×(1-0.95) = 2.97 acc. in six years.
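The same arithmetic as a minimal Python sketch, using the values a = 3.55, b = 0.85 and CMF = 0.95 from the slides; the variable names are illustrative.

```python
a, b = 3.55, 0.85                      # Gamma parameters taken from the slides
cmf = 0.95                             # crash modification factor of the treatment
n_k = {4: 33, 5: 14, 6: 3, 7: 1}       # treated drivers, by crash count k

# EB estimate of mu for a driver who recorded k crashes
eb_mean = {k: (k + b) / (a + 1) for k in n_k}

# Crashes expected without treatment, summed over all treated drivers
expected_total = sum(n_k[k] * eb_mean[k] for k in n_k)
expected_saving = expected_total * (1 - cmf)

for k in n_k:
    print(k, n_k[k], f"{eb_mean[k]:.2f}", f"{n_k[k] * eb_mean[k]:.1f}")
print(f"expected without treatment: {expected_total:.1f} crashes in six years")
print(f"expected reduction: {expected_saving:.2f} crashes in six years")
```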
Use Ê{μ} and σ̂{μ} for: Research about CMF
The 51 drivers with k≥4 received some treatment.
Question: If treatment had no effect, and nothing
else changed, how many crashes are they expected
to have in a 6-year ‘after treatment’ period?
Just as before: E{μ|k} = (k+b)/(a+1)

k      n(k)   (k+b)/(a+1)   n(k)*(k+b)/(a+1)
4      33     1.07          35.2
5      14     1.29          18.0
6      3      1.51          4.5
7      1      1.73          1.7
Sum                         59.4
How come drivers with 227 accidents
are expected to have only 59.4?
Before: 4*33+5*14+6*3+7*1 = 227 crashes in six years.
If ineffective, expected after = 59 crashes in six years.
227-59 = 168: regression to the mean!
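A minimal Python sketch of this before/after comparison, again with a = 3.55 and b = 0.85 from the slides. It illustrates the arithmetic only and is not an evaluation method.

```python
a, b = 3.55, 0.85                      # Gamma parameters taken from the slides
n_k = {4: 33, 5: 14, 6: 3, 7: 1}       # drivers selected because k >= 4

# Crashes actually recorded by the selected drivers in the 'before' period
crashes_before = sum(k * n for k, n in n_k.items())

# Crashes expected in an equally long 'after' period if the treatment
# has no effect and nothing else changes (sum of the EB estimates)
expected_after = sum(n * (k + b) / (a + 1) for k, n in n_k.items())

# The difference is due to selection, not to any treatment effect
regression_to_mean = crashes_before - expected_after

print(f"before: {crashes_before} crashes")
print(f"expected after, no effect: {expected_after:.1f} crashes")
print(f"regression to the mean: {regression_to_mean:.1f} crashes")
```

A naive before-after comparison would credit this whole difference to the treatment, although none of it is a treatment effect.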
Summary of illustrations:
We used estimates of E{μ} and VAR{μ} to:
• Estimate how many deviant units are in a
population;
• Estimate how many deviants are in subpopulations
of units with many crashes (correct and false positives
and negatives);
• Estimate how many crashes will be saved by a treatment and
how many to expect after an ineffective treatment.
Two perspectives on SPF
E{μ} and σ{μ} = f(Traits, parameters)
• Applications-centered perspective
• Cause-and-effect-centered perspective
The perspective determines how modeling is done
Applications-centered perspective: E{μ} and σ{μ} = f(Traits, parameters)
Here the question is: “How to do modeling to get good
estimates of E{μ} and σ{μ}?”
Cause-and-effect-centered perspective: E{μ} and σ{μ} = f(Traits, parameters)
Here the question is: “How to do modeling to get the
right ‘f’ and parameters so that I can compute the
change in E{μ} caused by a change in a trait?”
Summary of 1.
1. We defined ‘safety’;
2. The safety of a unit is determined by its s-r traits;
3. Units that share some traits form a population;
4. The safety of a population is described by E{μ} and σ{μ};
5. The SPF is ...
A Safety Performance Function is a tool which for a multitude of
populations provides estimates of:
1. The mean of the μ’s in populations - E{μ} and its accuracy;
2. The standard deviation of the μ’s in these populations - σ{μ}.