Class 7: EHA Models 5

More EHA Models &
Diagnostics
Sociology 229A: Event History Analysis
Class 7
Copyright © 2008 by Evan Schofer
Do not copy or distribute without permission
Announcements
• Assignment #5 due
• Final paper assignment handed out
• Due at end of quarter
• Class topics:
– AFT models
– Stratified Models
– More on residuals, diagnostics
– Discussion: Empirical Paper
Short Paper Assignment
• New Topic: Organizational mortality among
“licensed lenders”
• A type of credit company regulated by New York state
– “Mom & pop” lenders… eventually largely outcompeted by
modern banks/credit cards…
– Examples:
• Empire City Personal Loan Company
– Founded 1932, Dissolved 1938
• American Credit Company
» Renamed “Liberty Loan Company” in 1942
– Founded 1902, Dissolved 1964
– Branch office in 1947; dissolved in 1955
– Branch office in 1955; censored in 1965.
Short Paper Assignment
• Licensed lenders dataset
– Unit of analysis: Organization
• Branch offices each have an independent government
license, are treated as fully separate organizations
– Data structure:
• Annual data set
– Time-series / “Long form”, split-spell data
– Outcome of interest: Organizational mortality
• When the organization dies/dissolves/shuts down
– Rudimentary independent variables included…
Short Paper Assignment
• Project goals:
– 1. Test a series of hypotheses (which I provide)
using EHA models
– 2. Run some simple EHA diagnostics
• Check proportionality assumption for one X var
• Check for outliers using residuals
– 3. Write up results (4-5 pages)
• Like the methods/results section of a short journal
article…
Accelerated Failure Time Models
• We’ve been modeling the hazard rate: h(t)
• Most parametric approaches build on Cox strategy…
• An alternative approach: model log time
• Using parametric approach like exponential or Weibull
• Focus is time rather than hazard rate:
$$\ln(t) = X\beta + \epsilon$$
• Where the last term, ε, is assumed to have a distribution
that defines the model (e.g., making it Weibull)
– Recall: the odd distribution of the error term is the problem with OLS
– What if we introduced a complex parameter here!
Accelerated Failure Time Models
• Cleves et al. 2004: AFT (or “log time”) models
aren’t actually new kinds of models
• Rather, they are re-expressing the same models in a
different metric…
• Instead of expressing effects on hazard rate,
coefficients reflect effect on log time to event
• Instead of “hazard ratios” you can compute “time ratios”
– Substantive emphasis is on TIME to event
• This can be desirable… more concrete than haz rates
– Issue: coefficients have opposite signs!!!
• A variable that increases hazard rate will decrease
time to event.
Proportional Hazard vs. AFT
• Blossfeld data: Upward employment moves
. streg edu coho2 coho3 lfx pnoj pres if pres <=65, dist(exponential) nohr

Exponential regression -- log relative-hazard form

No. of subjects =          591                     Number of obs   =       591
No. of failures =           84
Time at risk    =        40161
                                                   LR chi2(6)      =    131.39
Log likelihood  =   -253.68509                     Prob > chi2     =    0.0000

------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         edu |   .3020663   .0429622     7.03   0.000     .2178619    .3862708
       coho2 |   .6366232   .2713856     2.35   0.019     .1047172    1.168529
       coho3 |   .7340517   .2766077     2.65   0.008     .1919105    1.276193
         lfx |  -.0022632   .0020781    -1.09   0.276    -.0063363    .0018098
        pnoj |   .1734636   .1003787     1.73   0.084    -.0232751    .3702022
        pres |   -.143771   .0142008   -10.12   0.000     -.171604    -.115938
       _cons |  -5.116249   .6197422    -8.26   0.000    -6.330922   -3.901577
------------------------------------------------------------------------------
Annotation: Log relative hazard = Proportional hazards model
Proportional Hazard vs. AFT
• Blossfeld data: Upward employment moves
. streg edu coho2 coho3 lfx pnoj pres if pres <=65, dist(exponential) nohr time

Exponential regression -- accelerated failure-time form

No. of subjects =          591                     Number of obs   =       591
No. of failures =           84
Time at risk    =        40161
                                                   LR chi2(6)      =    131.39
Log likelihood  =   -253.68509                     Prob > chi2     =    0.0000

------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         edu |  -.3020663   .0429622    -7.03   0.000    -.3862708   -.2178619
       coho2 |  -.6366232   .2713856    -2.35   0.019    -1.168529   -.1047172
       coho3 |  -.7340517   .2766077    -2.65   0.008    -1.276193   -.1919105
         lfx |   .0022632   .0020781     1.09   0.276    -.0018098    .0063363
        pnoj |  -.1734636   .1003787    -1.73   0.084    -.3702022    .0232751
        pres |    .143771   .0142008    10.12   0.000      .115938     .171604
       _cons |   5.116249   .6197422     8.26   0.000     3.901577    6.330922
------------------------------------------------------------------------------
Annotation: Streg option “time” specifies AFT form
Note that the log likelihood and the magnitudes of the z-values are the same.
However, all signs are reversed, and the coefficients are in a different metric (log time rather than log hazard).
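The two metrics can also be reported as ratios rather than raw coefficients. The lines below are a minimal sketch, assuming the Blossfeld job data are loaded and already stset as in the commands on the slides; the hand calculation uses the edu coefficient from the output above.

```stata
* Assumption: Blossfeld job data are in memory and already stset.

* Proportional hazards metric: leaving out nohr reports hazard ratios, exp(b)
streg edu coho2 coho3 lfx pnoj pres if pres <= 65, dist(exponential)

* AFT metric: the tr option reports time ratios, exp(b), for the log-time model
streg edu coho2 coho3 lfx pnoj pres if pres <= 65, dist(exponential) time tr

* The same conversion by hand for edu:
* hazard ratio, roughly 1.35
display exp(.3020663)
* time ratio, roughly 0.74
display exp(-.3020663)
```

Read together: one more unit of edu raises the hazard of an upward move by about 35 percent, which is the same as shrinking the expected waiting time to about 74 percent of its previous value.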
Proportional vs. AFT metric
• Weibull models: Here, coefficients differ…
. streg edu coho2 coho3 lfx pnoj pres if pres <=65, dist(weibull) nohr

Weibull regression -- log relative-hazard form

------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         edu |   .3004217   .0438282     6.85   0.000     .2145201    .3863234
       coho2 |   .6259013   .2775622     2.25   0.024     .0818895    1.169913
       coho3 |   .7189294   .2886739     2.49   0.013     .1531389     1.28472
         lfx |  -.0022896   .0020818    -1.10   0.271    -.0063698    .0017906
        pnoj |   .1719096   .1007356     1.71   0.088    -.0255286    .3693478
        pres |  -.1430822   .0146639    -9.76   0.000     -.171823   -.1143414
       _cons |  -5.043614   .7361298    -6.85   0.000    -6.486402   -3.600826

. streg edu coho2 coho3 lfx pnoj pres if pres <=65, dist(weibull) nohr time

Weibull regression -- accelerated failure-time form

------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         edu |  -.3048278    .046158    -6.60   0.000    -.3952959   -.2143598
       coho2 |   -.635081   .2753596    -2.31   0.021    -1.174776   -.0953861
       coho3 |  -.7294735   .2817224    -2.59   0.010    -1.281639   -.1773078
         lfx |   .0023232   .0021333     1.09   0.276    -.0018581    .0065045
        pnoj |  -.1744309   .1019852    -1.71   0.087    -.3743182    .0254564
        pres |   .1451807   .0163841     8.86   0.000     .1130684    .1772929
       _cons |   5.117586   .6280134     8.15   0.000     3.886702    6.348469
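The Weibull coefficients differ by more than sign because the shape parameter p enters the conversion. The block below sketches the standard relationship between the two metrics. The slide output does not print p, so the value used here is back-calculated from the edu row and should be read as an inference, not a reported estimate.

```latex
\begin{align*}
h(t \mid X) &= p\,t^{\,p-1}\exp(X\beta_{PH})
  && \text{(proportional hazards form)}\\
\ln(t) &= X\beta_{AFT} + \tfrac{1}{p}\,u,\quad u \sim \text{extreme value}
  && \text{(AFT form)}\\
\beta_{PH} &= -\,p\,\beta_{AFT}
  && \text{(link between the metrics)}\\
p &\approx \frac{.3004217}{.3048278} \approx 0.9855
  && \text{(implied by the edu row)}\\
0.9855 \times .635081 &\approx .6259
  && \text{(coho2 check; reported PH value } .6259013)
\end{align*}
```

With p = 1 the Weibull reduces to the exponential, and the link collapses to a pure sign flip, which is what the exponential tables show.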
Accelerated Failure Time Models
• Remarks:
– 1. AFT models are less common, but you’ll run
across them occasionally
– 2. It is important to recognize them…
• Because coefficient interpretations are opposite!
– 3. STATA currently offers more parametric
options for AFT models
• Log-logistic and log-normal are only available in AFT
• These are non-monotonic curves, might be useful…
– So, you might consider them if you are having trouble with
model fit.
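One way to act on remark 3 is to fit several AFT distributions and compare information criteria. This is only a sketch, assuming the Blossfeld job data are loaded and stset; the stored-estimate names (weib, lnorm, llog) are made up for illustration.

```stata
* Assumption: Blossfeld job data are in memory and already stset.

* Weibull in AFT form, for reference
streg edu coho2 coho3 lfx pnoj pres if pres <= 65, dist(weibull) time
estimates store weib

* Log-normal and log-logistic are estimated in AFT form only
streg edu coho2 coho3 lfx pnoj pres if pres <= 65, dist(lognormal)
estimates store lnorm

streg edu coho2 coho3 lfx pnoj pres if pres <= 65, dist(loglogistic)
estimates store llog

* Compare fit: lower AIC/BIC suggests a better-fitting distribution
estimates stats weib lnorm llog
```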
Parametric Models & Predictions
• Parametric models allow prediction of failure
times for all cases
• Whether using proportional hazard or AFT metric
– Strategy: run model, then use “predict” command
– Issues:
• 1. You have many prediction options…
– “Mean” estimated time; Median estimated time (+ log options)
• 2. If you have split-spell data, you’ll get a prediction for
EACH record in the data
– Predictions take into account X variables
– As X variables change, predicted time changes, too!
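The prediction strategy might look like the following sketch. It assumes the Blossfeld job data are loaded and stset, and it borrows the variable names time and mdtime from the listing on the next slide; the model specification is taken from the earlier exponential example and may not be the one used to produce that listing.

```stata
* Assumption: Blossfeld job data are in memory and already stset.
streg edu coho2 coho3 lfx pnoj pres if pres <= 65, dist(exponential) time

* Predicted mean time to event for every record
predict time, mean time

* Predicted median time to event for every record
predict mdtime, median time

* With split-spell data, each record gets its own prediction
list id duration event sex time mdtime in 1/10
```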
Predicted Times
• Blossfeld job data (upward moves)
. list id duration event sex time mdtime

     +----------------------------------------------------+
     | id   duration   event   sex       time      mdtime |
     |----------------------------------------------------|
  1. |  1        427       0     1   130.2342    90.27149 |
  2. |  2         45       1     2   192.2021    133.2243 |
  3. |  2         33       0     2   5651.612    3917.399 |
  4. |  2        219       0     2   5131.651     3556.99 |
 14. |  6         25       1     1   205.6662     142.557 |
 20. |  7          5       1     2   116.0007    80.40555 |
 21. |  7         14       0     2   416.3065    288.5616 |
 29. | 10        120       1     1    690.877    478.8794 |
 30. | 10        141       1     1   2412.739    1672.383 |
 31. | 10        120       0     1   21855.97    15149.41 |
 37. | 12         27       1     1   92.27634    63.96109 |
 38. | 12         70       0     1   2605.027    1805.667 |
 39. | 13         38       0     2   774.3403    536.7318 |
 40. | 13        101       0     2   1094.581    758.7059 |
 41. | 14         35       0     2   579.2303    401.4919 |
 42. | 14         86       0     2   528.3259    366.2076 |
 43. | 15         11       0     1   1612.258    1117.532 |
 44. | 15         11       1     1   139.5957    96.76038 |
     +----------------------------------------------------+

Callout (row 20, id 7): Predicted median time is 80 months, actual upward move occurred in 5 months…
Callout: Model really doesn’t expect this case to have an upward job transition…
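In the listing, mdtime is a constant fraction of time in every row. That pattern is what an exponential model implies, although the slide does not state which distribution produced these predictions, so treat the following as a consistency check under that assumption.

```latex
\begin{align*}
S(t) &= \exp(-\lambda t), \qquad \text{mean} = \frac{1}{\lambda}\\
S(t_{med}) &= 0.5 \;\Rightarrow\; t_{med} = \frac{\ln 2}{\lambda} = \ln 2 \times \text{mean}\\
\text{id 7, first record:}\quad 116.0007 \times 0.6931 &\approx 80.41
  \quad (\text{listed mdtime: } 80.40555)
\end{align*}
```

Because the distribution is right-skewed, the median is always below the mean, so which one you report changes how long the "typical" wait appears.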
Parametric Models & Predictions
• Useful things you can do with predictions:
– 1. Highlight some examples to give your reader a
concrete sense of event timing…
– 2. Construct predictions that reflect different
values of X variables
• Ex: Run model. Make predictions. Recode Xs. Make
further predictions
– Example: How would the predicted time-to-event change if
case was male, rather than female
– Ex: Environmental treaties: What is predicted time to treaty
signing if democracy were 10 rather than 1?
• Vividly illustrates coefficient effects.
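The "run model, predict, recode Xs, predict again" routine could be sketched as below. It assumes the Blossfeld job data are loaded and stset; the names md_obs, md_plus1, edu_saved, and ratio are hypothetical, and the one-unit change in edu is just an illustrative counterfactual.

```stata
* Assumption: Blossfeld job data are in memory and already stset.
streg edu coho2 coho3 lfx pnoj pres if pres <= 65, dist(exponential) time

* Prediction at the observed X values
predict md_obs, median time

* Recode X: give every record one more unit of education
generate edu_saved = edu
replace edu = edu + 1
predict md_plus1, median time

* Put the original values back
replace edu = edu_saved
drop edu_saved

* In an exponential AFT model this ratio is the time ratio exp(b) for edu
generate ratio = md_plus1 / md_obs
summarize ratio
```

Listing md_obs next to md_plus1 for a few cases gives the reader concrete time-to-event figures to set beside the coefficient.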
Residuals – Summary
• From Cleves et al. (2004) An Introduction to Survival
Analysis Using Stata, p. 184:
• 1. Cox-Snell residuals
• … are useful for assessing overall model fit
• 2. Martingale residuals
• Are useful in determining the functional form of the covariates to
be included in the model
• 3. Schoenfeld residuals (scaled & unscaled), score
residuals, and efficient score residuals
• Are useful for checking & testing the proportional hazard
assumption, examining leverage points, and identifying outliers
• NOTE: A residual is produced for each independent variable…
• 4. Deviance residuals
• Are useful in examining model accuracy and identifying outliers.
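Two of these checks, the proportional hazards test and outlier screening, might be run as in the sketch below. It assumes Stata 11 or later, where residuals are requested with predict after estimation (older versions require residual options on the stcox command itself), and it assumes the data are stset with the covariates from the earlier examples. The variable name dev is made up.

```stata
* Assumption: data are in memory and already stset; Stata 11+ syntax.
stcox edu coho2 coho3 lfx pnoj pres

* Schoenfeld-residual test of proportional hazards, one row per covariate
estat phtest, detail

* Deviance residuals: unusually large absolute values flag possible outliers
predict dev, deviance
scatter dev _t, yline(0)
```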
Cox-Snell Residuals
• Cox-Snell residuals for case i:
$$CS_i = \hat{H}_0(t)\,\exp(\hat{\beta} X_i)$$
• Where $\hat{H}_0(t)$ is the estimate of the baseline cumulative hazard
– Based on model results
• $\hat{\beta}$ are the coefficient estimates from the model
• $X_i$ are the values for each case in your data
– Interpretation: “The expected number of events in
a given time-interval”
– Box-Steffensmeier & Jones 2004.
Cox-Snell residuals: Model Fit
• Cox-Snell residuals can be plotted to assess
model fit
• If model fits well, graph of integrated (cumulative)
hazard conditional on Cox-Snell residuals vs. Cox-Snell
residuals will fall on a line
– Strategy in stata:
• Run Cox model, request martingale residuals
• Use “predict” to compute Cox-Snell residuals
• Stset your data again, with Cox-Snell as time variable
• Compute integrated hazard
• Graph integrated hazard versus residuals.
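The steps above can be sketched as follows. The time and failure variables (tf, des) follow the stset command used elsewhere in these slides, the covariate list is borrowed from the earlier examples, and the names cs and H are made up. The syntax assumes Stata 11 or later; in Stata 10 the Cox model must be run with the mgale() option before predict will produce Cox-Snell residuals.

```stata
* Assumption: tf is the time variable and des the failure indicator.
stset tf, failure(des)

* Step 1: run the Cox model
stcox edu coho2 coho3 lfx pnoj pres

* Step 2: compute Cox-Snell residuals
predict cs, csnell

* Step 3: stset again, with the Cox-Snell residual as the time variable
stset cs, failure(des)

* Step 4: compute the Nelson-Aalen integrated (cumulative) hazard
sts generate H = na

* Step 5: graph the integrated hazard against the residuals;
* plotting cs against itself draws the 45-degree reference line
line H cs cs, sort ytitle("") legend(order(1 "Cumulative hazard" 2 "Reference line"))
```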
Cox-Snell Model Fit Example
• Cox-Snell Plot for Environmental Law data
[Figure: Nelson-Aalen cumulative hazard and reference line, both plotted against the partial Cox-Snell residual]
Note: Don’t worry much about deviations from the line at the right edge of the plot. There are typically few cases there…
This looks quite bad. Cumulative hazard should fall on the line… Instead, there is a sizable gap.
Martingale Residuals
• Martingale residuals: More intuitive…
• Difference between observed event (vs. censored) and
expected number of events a case is predicted to have
– Based on hazard rate given X vars…
• Martingale residuals range from –infinity to +1
– Often very skewed
– Deviance residuals: Normalized version of
martingale residuals.
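Written out, the definitions above connect the martingale residual to the Cox-Snell residual (the expected number of events). The formulas are the standard ones; the numeric case with a Cox-Snell residual of 0.2 is a made-up illustration, not a value from the data.

```latex
\begin{align*}
M_i &= \delta_i - CS_i, \qquad \delta_i = 1 \text{ if event, } 0 \text{ if censored}\\
CS_i \ge 0 &\;\Rightarrow\; -\infty < M_i \le 1\\
D_i &= \operatorname{sign}(M_i)\,\sqrt{-2\left[\,M_i + \delta_i \ln(\delta_i - M_i)\,\right]}\\
\text{Example: } \delta_i = 1,\; CS_i = 0.2 &\;\Rightarrow\; M_i = 0.8,\quad
D_i = \sqrt{-2\,[\,0.8 + \ln(0.2)\,]} \approx 1.27
\end{align*}
```

For a censored case the logarithm term drops out, so the deviance residual is simply $-\sqrt{2\,CS_i}$. The transformation pulls in the long negative tail, which is why deviance residuals are easier to scan for outliers.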
MG Residuals and Functional Form
• Issue: What functional form of independent
variables should you choose?
• Ex: Should you log your independent variables?
– Skewness is one consideration; but you also want to specify
the correct relationship between vars…
– In OLS regression we can plot X vars versus
residuals to identify departures from linearity
• In EHA, we can do something similar:
• Estimate Cox model without covariates, save
martingale residuals
• Use “lowess” command to plot mean residuals versus
X variables
• Functional form that is closest to a flat line = best.
MG Residuals and Functional Form
• Stata code:
*
* Use Martingale Residuals to check functional form
*
stset tf, fail(des)
* Estimate a cox model with NO covariates
* -- option "estimate" makes this happen
* Plus, create a new variable "mg" containing
* Martingale residuals
stcox , mgale(mg) estimate
* Next, plot residuals versus different transformations
* of your X variables (with smoothed mean – lowess)
lowess mg lfx
lowess mg lfxcubed
lowess mg loglfx
Martingale Functional Form Example
• Blossfeld employment termination data
• Should labor force experience be raw, logged, cubed?
[Figure: Lowess smoother of martingale residuals versus lfxcubed; bandwidth = .8]
Labor force experience is CUBED…
Note the SHARP curve near zero… Very non-linear
This is really bad.
Martingale Functional Form Example
• Blossfeld employment termination data
• Should labor force experience be raw, logged, cubed?
[Figure: Lowess smoother of martingale residuals versus lfx; bandwidth = .8]
This is RAW labor force experience
Not bad… close to a flat line.
Martingale Functional Form Example
• Blossfeld employment termination data
• Should labor force experience be raw, logged, cubed?
[Figure: Lowess smoother of martingale residuals versus loglfx; bandwidth = .8]
Labor force experience, logged
This is the best yet… but not a big difference from raw…
Discussion: Empirical Example
• Soule, Sarah A and Susan Olzak. 2004.
“When Do Movements Matter? The Politics of
Contingency and the Equal Rights
Amendment.” American Sociological Review,
Vol. 69, No. 4. (Aug., 2004), pp. 473-497.