More EHA Models & Diagnostics
Sociology 229A: Event History Analysis
Class 7
Copyright © 2008 by Evan Schofer
Do not copy or distribute without permission

Announcements
• Assignment #5 due
• Final paper assignment handed out
• Due at end of quarter
• Class topic:
  • AFT models
  • Stratified Models
  • More on residuals, diagnostics
  • Discussion: Empirical Paper

Short Paper Assignment
• New Topic: Organizational mortality among “licensed lenders”
• A type of credit company regulated by New York state
  – “Mom & pop” lenders… eventually largely outcompeted by modern banks/credit cards…
  – Examples:
    • Empire City Personal Loan Company
      – Founded 1932, Dissolved 1938
    • American Credit Company
        » Renamed “Liberty Loan Company” in 1942
      – Founded 1902, Dissolved 1964
      – Branch office in 1947; dissolved in 1955
      – Branch office in 1955; censored in 1965.

Short Paper Assignment
• Licensed lenders dataset
  – Unit of analysis: Organization
    • Branch offices each have an independent government license, are treated as fully separate organizations
  – Data structure:
    • Annual data set
      – Time-series / “Long form”, split-spell data
  – Outcome of interest: Organizational mortality
    • When the organization dies/dissolves/shuts down
  – Rudimentary independent variables included…

Short Paper Assignment
• Project goals:
  – 1. Test a series of hypotheses (which I provide) using EHA models
  – 2. Run some simple EHA diagnostics
    • Check proportionality assumption for one X var
    • Check for outliers using residuals
  – 3.
    Write up results (4-5 pages)
    • Like the methods/results section of a short journal article…

Accelerated Failure Time Models
• We’ve been modeling the hazard rate: h(t)
  • Most parametric approaches build on Cox strategy…
• An alternative approach: model log time
  • Using parametric approach like exponential or Weibull
• Focus is time rather than hazard rate:

    $\ln(t) = \beta X + e$

  • Where last term “e” is assumed to have a distribution that defines the model (e.g., making it Weibull)
    – Recall: odd distribution of e is the problem with OLS
    – What if we introduced a more complex parameterization for e here?

Accelerated Failure Time Models
• Cleves et al. 2004: AFT (or “log time”) models aren’t actually new kinds of models
  • Rather, they re-express the same models in a different metric…
  • Instead of expressing effects on hazard rate, coefficients reflect effect on log time to event
  • Instead of “hazard ratios” you can compute “time ratios”
    – Substantive emphasis is on TIME to event
      • This can be desirable… more concrete than hazard rates
    – Issue: coefficients have opposite signs!
      • A variable that increases hazard rate will decrease time to event.

Proportional Hazard vs. AFT
• Blossfeld data: Upward employment moves

. streg edu coho2 coho3 lfx pnoj pres if pres <=65, dist(exponential) nohr

Exponential regression -- log relative-hazard form

No. of subjects =          591                     Number of obs   =       591
No. of failures =           84
Time at risk    =        40161
                                                   LR chi2(6)      =    131.39
Log likelihood  =   -253.68509                     Prob > chi2     =    0.0000

------------------------------------------------------------------------------
          _t |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         edu |   .3020663   .0429622     7.03   0.000     .2178619    .3862708
       coho2 |   .6366232   .2713856     2.35   0.019     .1047172    1.168529
       coho3 |   .7340517   .2766077     2.65   0.008     .1919105    1.276193
         lfx |  -.0022632   .0020781    -1.09   0.276    -.0063363    .0018098
        pnoj |   .1734636   .1003787     1.73   0.084    -.0232751    .3702022
        pres |   -.143771   .0142008   -10.12   0.000     -.171604    -.115938
       _cons |  -5.116249   .6197422    -8.26   0.000    -6.330922   -3.901577
------------------------------------------------------------------------------

Note: “Log relative hazard” form = proportional hazards model.

Proportional Hazard vs. AFT
• Blossfeld data: Upward employment moves

. streg edu coho2 coho3 lfx pnoj pres if pres <=65, dist(exponential) nohr time

Exponential regression -- accelerated failure-time form

No. of subjects =          591                     Number of obs   =       591
No. of failures =           84
Time at risk    =        40161
                                                   LR chi2(6)      =    131.39
Log likelihood  =   -253.68509                     Prob > chi2     =    0.0000

------------------------------------------------------------------------------
          _t |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         edu |  -.3020663   .0429622    -7.03   0.000    -.3862708   -.2178619
       coho2 |  -.6366232   .2713856    -2.35   0.019    -1.168529   -.1047172
       coho3 |  -.7340517   .2766077    -2.65   0.008    -1.276193   -.1919105
         lfx |   .0022632   .0020781     1.09   0.276    -.0018098    .0063363
        pnoj |  -.1734636   .1003787    -1.73   0.084    -.3702022    .0232751
        pres |    .143771   .0142008    10.12   0.000      .115938     .171604
       _cons |   5.116249   .6197422     8.26   0.000     3.901577    6.330922
------------------------------------------------------------------------------

Note: Streg option “time” specifies AFT form.
Note that the log likelihood and the magnitudes of the z values are the same. However, all signs are reversed, and the coefficients are now in the log-time metric rather than the log-hazard metric.

Proportional vs. AFT metric
• Weibull models: Here, coefficient magnitudes differ as well as signs…

. streg edu coho2 coho3 lfx pnoj pres if pres <=65, dist(weibull) nohr

Weibull regression -- log relative-hazard form

------------------------------------------------------------------------------
          _t |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         edu |   .3004217   .0438282     6.85   0.000     .2145201    .3863234
       coho2 |   .6259013   .2775622     2.25   0.024     .0818895    1.169913
       coho3 |   .7189294   .2886739     2.49   0.013     .1531389     1.28472
         lfx |  -.0022896   .0020818    -1.10   0.271    -.0063698    .0017906
        pnoj |   .1719096   .1007356     1.71   0.088    -.0255286    .3693478
        pres |  -.1430822   .0146639    -9.76   0.000     -.171823   -.1143414
       _cons |  -5.043614   .7361298    -6.85   0.000    -6.486402   -3.600826
------------------------------------------------------------------------------

. streg edu coho2 coho3 lfx pnoj pres if pres <=65, dist(weibull) nohr time

Weibull regression -- accelerated failure-time form

------------------------------------------------------------------------------
          _t |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         edu |  -.3048278    .046158    -6.60   0.000    -.3952959   -.2143598
       coho2 |   -.635081   .2753596    -2.31   0.021    -1.174776   -.0953861
       coho3 |  -.7294735   .2817224    -2.59   0.010    -1.281639   -.1773078
         lfx |   .0023232   .0021333     1.09   0.276    -.0018581    .0065045
        pnoj |  -.1744309   .1019852    -1.71   0.087    -.3743182    .0254564
        pres |   .1451807   .0163841     8.86   0.000     .1130684    .1772929
       _cons |   5.117586   .6280134     8.15   0.000     3.886702    6.348469
------------------------------------------------------------------------------

Accelerated Failure Time Models
• Remarks:
  – 1. AFT models are less common, but you’ll run across them occasionally
  – 2. It is important to recognize them…
    • Because coefficient signs, and therefore interpretations, are opposite!
  – 3. STATA currently offers more parametric options for AFT models
    • Log-logistic and log-normal are only available in AFT
    • These allow non-monotonic hazard curves, which might be useful…
      – So, you might consider them if you are having trouble with model fit.

Parametric Models & Predictions
• Parametric models allow prediction of failure times for all cases
  • Whether using proportional hazard or AFT metric
  – Strategy: run model, then use “predict” command
  – Issues:
    • 1. You have many prediction options…
      – “Mean” estimated time; Median estimated time (+ log options)
    • 2.
      If you have split-spell data, you’ll get a prediction for EACH record in the data
      – Predictions take into account X variables
      – As X variables change, predicted time changes, too!

Predicted Times
• Blossfeld job data (upward moves)

. list id duration event sex time mdtime

     +---------------------------------------------------+
     | id   duration   event   sex       time     mdtime |
     |---------------------------------------------------|
  1. |  1        427       0     1   130.2342   90.27149 |
  2. |  2         45       1     2   192.2021   133.2243 |
  3. |  2         33       0     2   5651.612   3917.399 |
  4. |  2        219       0     2   5131.651    3556.99 |
 14. |  6         25       1     1   205.6662    142.557 |
 20. |  7          5       1     2   116.0007   80.40555 |
 21. |  7         14       0     2   416.3065   288.5616 |
 29. | 10        120       1     1    690.877   478.8794 |
 30. | 10        141       1     1   2412.739   1672.383 |
 31. | 10        120       0     1   21855.97   15149.41 |
 37. | 12         27       1     1   92.27634   63.96109 |
 38. | 12         70       0     1   2605.027   1805.667 |
 39. | 13         38       0     2   774.3403   536.7318 |
 40. | 13        101       0     2   1094.581   758.7059 |
 41. | 14         35       0     2   579.2303   401.4919 |
 42. | 14         86       0     2   528.3259   366.2076 |
 43. | 15         11       0     1   1612.258   1117.532 |
 44. | 15         11       1     1   139.5957   96.76038 |
     +---------------------------------------------------+

Note (obs. 20, id 7): Predicted median time is 80 months, actual upward move occurred in 5 months… Model really doesn’t expect this case to have an upward job transition…

Parametric Models & Predictions
• Useful things you can do with predictions:
  – 1. Highlight some examples to give your reader a concrete sense of event timing…
  – 2. Construct predictions that reflect different values of X variables
    • Ex: Run model. Make predictions. Recode Xs. Make further predictions
      – Example: How would the predicted time-to-event change if the case were male rather than female?
      – Ex: Environmental treaties: What is predicted time to treaty signing if democracy were 10 rather than 1?
    • Vividly illustrates coefficient effects.

Residuals – Summary
• From Cleves et al. (2004) An Introduction to Survival Analysis Using Stata, p. 184:
• 1. Cox-Snell residuals
  • … are useful for assessing overall model fit
• 2.
  Martingale residuals
  • Are useful in determining the functional form of the covariates to be included in the model
• 3. Schoenfeld residuals (scaled & unscaled), score residuals, and efficient score residuals
  • Are useful for checking & testing the proportional hazard assumption, examining leverage points, and identifying outliers
  • NOTE: A residual is produced for each independent variable…
• 4. Deviance residuals
  • Are useful for examining model accuracy and identifying outliers.

Cox-Snell Residuals
• Cox-Snell residuals for case i:

    $CSresid_i = \hat{H}_0(t_i)\,\exp(\hat{\beta} X_i)$

  • Where $\hat{H}_0(t)$ is the estimate of the baseline cumulative hazard
    – Based on model results
  • $\hat{\beta}$ are estimates from the model
  • $X_i$ are values for each case in your data
  – Interpretation: “The expected number of events in a given time-interval”
    – Box-Steffensmeier & Jones 2004.

Cox-Snell residuals: Model Fit
• Cox-Snell residuals can be plotted to assess model fit
  • If model fits well, graph of integrated (cumulative) hazard conditional on Cox-Snell residuals vs. Cox-Snell residuals will fall on a line (the 45-degree line through the origin)
  – Strategy in Stata:
    • Run Cox model, request martingale residuals
    • Use “predict” to compute Cox-Snell residuals
    • Stset your data again, with Cox-Snell as time variable
    • Compute integrated hazard
    • Graph integrated hazard versus residuals.

Cox-Snell Model Fit Example
• Cox-Snell Plot for Environmental Law data

[Figure: Nelson-Aalen cumulative hazard and partial Cox-Snell residual, plotted against partial Cox-Snell residual]

Note: Don’t worry much about deviations from the line at the right edge of the plot. There are typically few cases there…
This looks quite bad. Cumulative hazard should fall on the line… Instead, there is a sizable gap.

Martingale Residuals
• Martingale residuals: More intuitive…
  • Difference between the observed outcome (event = 1, censored = 0) and the expected number of events a case is predicted to have
    – Based on hazard rate given X vars…
  • Martingale residuals range from -infinity to +1
    – Often very skewed
    – Deviance residuals: Rescaled version of martingale residuals, more symmetric around zero.

MG Residuals and Functional Form
• Issue: What functional form of independent variables should you choose?
  • Ex: Should you log your independent variables?
    – Skewness is one consideration; but you also want to specify the correct relationship between vars…
    – In OLS regression we can plot X vars versus residuals to identify departures from linearity
• In EHA, we can do something similar:
  • Estimate Cox model without covariates, save martingale residuals
  • Use “lowess” command to plot mean residuals versus X variables
  • Functional form that is closest to a flat line = best.

MG Residuals and Functional Form
• Stata code:

    *
    * Use Martingale Residuals to check functional form
    *
    stset tf, fail(des)

    * Estimate a cox model with NO covariates
    * -- option "estimate" makes this happen
    * Plus, create a new variable "mg" containing
    * Martingale residuals
    stcox , mgale(mg) estimate

    * Next, plot residuals versus different transformations
    * of your X variables (with smoothed mean – lowess)
    lowess mg lfx
    lowess mg lfxcubed
    lowess mg loglfx

Martingale Functional Form Example
• Blossfeld employment termination data
  • Should labor force experience be raw, logged, cubed?

[Figure: Lowess smoother, x-axis lfxcubed, bandwidth = .8]

Labor force experience is CUBED…
Note the SHARP curve near zero… Very non-linear
This is really bad.

Martingale Functional Form Example
• Blossfeld employment termination data
  • Should labor force experience be raw, logged, cubed?

[Figure: Lowess smoother, x-axis lfx, bandwidth = .8]

This is RAW labor force experience
Not bad… close to a flat line.

Martingale Functional Form Example
• Blossfeld employment termination data
  • Should labor force experience be raw, logged, cubed?

[Figure: Lowess smoother, x-axis loglfx, bandwidth = .8]

Labor force experience, logged
This is the best yet… but not a big difference from raw…

Discussion: Empirical Example
• Soule, Sarah A and Susan Olzak. 2004. “When Do Movements Matter? The Politics of Contingency and the Equal Rights Amendment.” American Sociological Review, Vol. 69, No. 4. (Aug., 2004), pp. 473-497.
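
Worked note: Weibull coefficients in the two metrics
• A short check on the Weibull comparison from the “Proportional vs. AFT metric” slide. This is a sketch, not part of the original output: it assumes the usual Stata parameterization, in which the Weibull shape parameter p links the two metrics. The Weibull output shown earlier does not display p, so the value below is only backed out from the displayed coefficients and is approximate.

```latex
% Weibull: PH and AFT coefficients differ by the shape parameter p
% (assumed parameterization: h(t) = p t^{p-1} \exp(\beta^{PH} X))
\beta_k^{\mathrm{PH}} = -\,p\,\beta_k^{\mathrm{AFT}}
\quad\Longrightarrow\quad
p = -\frac{\beta_k^{\mathrm{PH}}}{\beta_k^{\mathrm{AFT}}}

% Backing p out of two of the displayed estimates
p_{\text{edu}}  = -\frac{0.3004217}{-0.3048278} \approx 0.9855, \qquad
p_{\text{pres}} = -\frac{-0.1430822}{0.1451807} \approx 0.9855

% Exponential is the special case p = 1, so only the sign flips
\beta_k^{\mathrm{PH}} = -\,\beta_k^{\mathrm{AFT}}
```

• Both covariates imply the same p, as they should. Because p is close to 1 here, the Weibull results sit very close to the exponential ones, which is why the PH and AFT magnitudes differ only slightly.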