Persistence of HMO Performance: Are Good HMOs Always Good and Bad HMOs Always Bad?

Shailender Swaminathan – University of Alabama at Birmingham
Mike Chernew – University of Michigan
Dennis Scanlon – Penn State University

This research was funded by a grant from the Agency for Healthcare Research & Quality [P01-HS10771]

Introduction
• The last 10-15 years have witnessed growth in quality/performance measures in healthcare
  – Hospitals, physicians, health insurance plans
• Increase in demand for such measures from payers, consumers, and regulators
  – Concerns about cost, value, quality and safety
• Utility of these measures depends on the ability of users to process multiple measures for purposes of decision making and evaluation

Two Key Challenges for Performance Measure Use
• There are often multiple indicators of quality for a given provider (e.g., hospital or doctor) or health plan, making it challenging to summarize overall performance
  – Need to find a way to aggregate measures, assuming each provides some signal about quality but also contains some noise
• Measures of performance are based on data gathered before decisions are made, and the decisions themselves generally do not take effect until some time after the decision date (e.g., 2003 data for 2005 plan enrollment that takes effect in 2006)
  – The utility of past data for predicting future performance depends on the stability of plan performance

Typical Financial Product Disclaimer
• “Past performance does not guarantee future results”
  – Should a consumer or purchaser/payer believe the same thing about health plans and health providers?

Longitudinal Variation: MMR Rate
[Figure: MMR rate (percent, vertical axis from 80 to 100) by year, 1998-2002]

Unadjusted Transition Probabilities
(Absolute rankings based on mean in 2000 and +/- one std dev; cells show probability and N)
Probability of rating in top third in year (t+1)

| Rating in t-1 | Rating in t | DTP Rate | MMR Rate | OPV Rate | HIB Rate | Hepatitis B Rate | VZV Rate |
|---|---|---|---|---|---|---|---|
| Above upper threshold | Above upper threshold | 0.75 (3) | 0.88 (24) | 0.78 (23) | 1.0 (3) | 0.5 (2) | . |
| In middle range | Above upper threshold | 0.78 (9) | 0.53 (19) | 0.75 (6) | 0.8 (10) | 0.6 (5) | . |
| Below lower threshold | Above upper threshold | 0.0 (2) | . | 0 (2) | 0.0 (2) | 0.0 (2) | 0.0 (2) |
| ALL | Above upper threshold | 0.67 (15) | 0.72 (42) | 0.73 (33) | 0.73 (15) | 0.44 (9) | 0.0 (2) |
| In middle range | In middle range | 0.03 (92) | 0.11 (99) | 0.12 (93) | 0.09 (99) | 0.03 (87) | 0.14 (7) |
| Below lower threshold | In middle range | 0.0 (20) | 0.0 (4) | 0.0 (12) | 0.0 (20) | 0.04 (23) | 0.0 (26) |
| ALL | In middle range | 0.02 (121) | 0.12 (114) | 0.12 (117) | 0.07 (129) | 0.04 (112) | 0.03 (33) |
| Below lower threshold | Below lower threshold | 0.0 (15) | . | 0 (2) | 0.0 (5) | 0 (23) | 0 (121) |

Research Objectives
• Develop a methodology to estimate longitudinal “transition probabilities” when multiple indicators of performance exist
• Estimate aggregate transition probabilities using multiple HEDIS childhood immunization measures
• Incorporate the effects of measured covariates on the transition probabilities

Empirical Model
• There are 6 HEDIS childhood immunization measures, so the model is:

$$
Y_{jit} = \lambda_{0j} + \lambda_{1j}\, q^{*}_{it} + \gamma'\,\mathrm{Hybrid}_{jit} + u_{jit}, \qquad j = 1, \dots, 6 \tag{1}
$$

  – j = 1: Chicken Pox (VZV) Rate
  – j = 2: H Influenza Type B (HIB) Rate
  – j = 3: Measles, Mumps & Rubella (MMR) Rate
  – j = 4: Hepatitis B Rate
  – j = 5: Diphtheria, Tetanus & Pertussis (DTP) Rate
  – j = 6: Polio (OPV) Rate
• Latent quality is a function of both time-invariant and time-varying covariates and is written as:

$$
q^{*}_{it} = \beta_0 + \beta_1' X_{it} + \beta_2' D_t + \alpha_i + \delta_i\,(t - \bar{t}) + \varepsilon_{it} \tag{2}
$$

Empirical Model
• The MIMIC model relates the indicators to the underlying latent variable, but we assume that the indicator-specific errors (u’s) are uncorrelated both contemporaneously and over time.
• X’s are measured covariates such as profit status and MSA managed care penetration
• D are time-period-specific dummy variables
• The term $\alpha_i$ represents unmeasured heterogeneity in quality levels while $\delta_i$ represents unmeasured heterogeneity in growth rates, and $\varepsilon_{it}$ is an independently distributed error term

Using Model Parameters to Estimate Transition Probabilities
• We simulate joint probabilities over multiple periods (e.g., the joint probability that plans were in the bottom third in 1998 and 1999) is:

$$
P\left(q_{98} < \tau_{1,98},\; q_{99} < \tau_{1,99}\right)
= \Phi_2\!\left(
\frac{\tau_{1,98} - \beta_0 - \beta_1' X_{98}}{\sqrt{\sigma_{11}}},\;
\frac{\tau_{1,99} - \beta_0 - \beta_1' X_{99}}{\sqrt{\sigma_{22}}}
\;\middle|\;
\frac{\sigma_{12}}{\sqrt{\sigma_{11}\sigma_{22}}}
\right) \tag{5}
$$

  where $\Phi_2(\cdot,\cdot \mid \rho)$ is the bivariate standard normal CDF with correlation $\rho$, $\tau_{1,t}$ is the lower threshold in year $t$, and $\sigma_{11}$, $\sigma_{22}$, $\sigma_{12}$ are the variances and covariance of latent quality in the two years.
• Transition probabilities can be simulated for transitions involving more than 2 periods (e.g., we can find the probability that plans are in the upper third in 2000 given that they were in the middle third in 1999 and in the bottom third in 1998)
  – We use the Geweke, Hajivassiliou and Keane (GHK) simulator (Geweke, Keane and Runkle 1994).
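
The simulation idea above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: it uses a crude frequency simulator (draw many latent-quality paths and count) rather than the GHK simulator, it drops the covariates and time dummies, it starts the AR(1) error from its stationary distribution, and it ranks plans by terciles within each year. The function names and all parameter values are hypothetical placeholders, not the reported estimates.

```python
import numpy as np

def simulate_quality(n_plans, n_years, sd_level, sd_growth, corr, rho, sd_eps, seed=0):
    """Draw latent quality paths q[i, t] = a_i + d_i * (t - tbar) + e[i, t].

    a_i (level) and d_i (growth) are correlated normal plan effects;
    e[i, t] is an AR(1) error with autocorrelation rho (an assumed form).
    """
    rng = np.random.default_rng(seed)
    cov_ad = corr * sd_level * sd_growth
    cov = np.array([[sd_level**2, cov_ad], [cov_ad, sd_growth**2]])
    a_d = rng.multivariate_normal([0.0, 0.0], cov, size=n_plans)
    t_centered = np.arange(n_years) - (n_years - 1) / 2.0
    e = np.empty((n_plans, n_years))
    # Assumption: the AR(1) process starts from its stationary distribution.
    e[:, 0] = rng.normal(0.0, sd_eps / np.sqrt(1.0 - rho**2), size=n_plans)
    for k in range(1, n_years):
        e[:, k] = rho * e[:, k - 1] + rng.normal(0.0, sd_eps, size=n_plans)
    return a_d[:, [0]] + a_d[:, [1]] * t_centered + e

def tercile_states(q):
    """Label each plan-year 0 (bottom), 1 (middle) or 2 (top third) within its year."""
    lo, hi = np.quantile(q, [1 / 3, 2 / 3], axis=0)
    return (q > lo).astype(int) + (q > hi).astype(int)

def prob_top_next(states, history):
    """Share of plans in the top third in the year right after `history`,
    among plans whose first len(history) ratings equal `history`."""
    h = len(history)
    match = np.all(states[:, :h] == np.asarray(history), axis=1)
    return float((states[match, h] == 2).mean())

# Illustrative placeholder parameters (hypothetical, not the paper's estimates).
q = simulate_quality(100_000, 3, sd_level=7.0, sd_growth=1.5,
                     corr=-0.6, rho=0.7, sd_eps=5.0)
s = tercile_states(q)
p_joint_bottom = float(np.mean((s[:, 0] == 0) & (s[:, 1] == 0)))  # bottom third in years 1 and 2
p_one_year = prob_top_next(s[:, 1:], [2])   # top in year 3 given top in year 2
p_two_years = prob_top_next(s, [2, 2])      # top in year 3 given top in years 1 and 2
print(p_joint_bottom, p_one_year, p_two_years)
```

A frequency simulator like this is easy to read but noisy for rare histories; a smooth simulator such as GHK is the usual choice when the probabilities enter a likelihood.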

Data
• Data sources
  – NCQA’s HEDIS (1998-2002)
    • We use 6 indicators of childhood immunization
    • We use data from both publicly reporting and non-publicly reporting plans, and we control for the data collection method used
  – Interstudy MSA Profiler and Competitive Edge
    • Plan and market characteristics

Summary Statistics: Distribution of Years of Data Per Measure (N=527)
% of Plans, by Childhood Immunization Measure (DTP Rate, MMR Rate, OPV Rate, HIB Rate, Hepatitis B Rate, VZV Rate) and years of data (1 year, 2 years, 3 years, 4 years, 5 years):
30% 28% 17% 17% 13% 14% 15% 13% 25% 28% 32% 30% 30% 30% 12% 17% 17% 16% 12% 13% 13% 14% 16% 15% 15% 15% 28% 25% 25% 25%
• About 40% of plans report at least 4 years of data for each measure
• Needed for identification of MIMIC model

Summary Statistics
Table 1: Descriptive Statistics for Quality Measures

| Childhood Immunization Measure | 1998 (N=357) Mean | Std Dev | 1999 (N=334) Mean | Std Dev | 2000 (N=329) Mean | Std Dev | 2001 (N=305) Mean | Std Dev | 2002 (N=290) Mean | Std Dev |
|---|---|---|---|---|---|---|---|---|---|---|
| DTP Rate | 75.93 | 14.93 | 78.87 | 12.95 | 80.79 | 12.07 | 81.68 | 10.77 | 80.38 | 11.91 |
| MMR Rate | 85.59 | 9.87 | 87.09 | 8.70 | 88.77 | 6.66 | 89.57 | 6.41 | 90.24 | 5.91 |
| OPV Rate | 81.62 | 14.10 | 82.83 | 11.79 | 84.65 | 11.22 | 85.56 | 9.68 | 86.35 | 10.97 |
| HIB Rate | 78.14 | 13.77 | 80.82 | 11.67 | 82.95 | 10.29 | 83.45 | 9.84 | 83.52 | 10.67 |
| Hepatitis B Rate | 71.32 | 17.28 | 75.34 | 14.29 | 78.04 | 13.81 | 80.19 | 11.82 | 82.17 | 12.63 |
| VZV Rate | 52.06 | 11.72 | 64.01 | 10.85 | 70.76 | 9.43 | 75.29 | 9.21 | 82.37 | 7.94 |

Results: Effect of Covariates on Quality

| Variable | Estimate | (Asymptotic standard error) |
|---|---|---|
| 0 <= Linear Spline Plan Age < 8 | 1.0844 *** | (0.3500) |
| 8 <= Linear Spline Plan Age < 12 | 0.5279 | (0.4121) |
| 12 <= Linear Spline Plan Age | 0.1338 | (0.1273) |
| For profit | -4.2460 *** | (1.1837) |
| MSA HMO Penetration (weighted) | 9.8481 ** | (3.9232) |
| Herfindahl Index (weighted) | 0.1427 | (1.7316) |
| % of MSA Population Non-white | -20.4495 *** | (6.0307) |
| MSA Per Capita Income (weighted) | 0.3207 ** | (0.1508) |
| Staff Group Model | 5.6873 ** | (2.8589) |
| IPA Network Model | 0.4975 | (0.8536) |
| MSA Population | -0.0994 *** | (0.0360) |
| Time Dummies (Reference = 1998) | | |
| Dummy Variable Year = 1999 | 0.8994 * | (0.5413) |
| Dummy Variable Year = 2000 | 3.4587 *** | (0.8293) |
| Dummy Variable Year = 2001 | 4.1045 *** | (1.1153) |
| Dummy Variable Year = 2002 | 4.6947 *** | (1.3135) |

Results: Variance Components

| Variance Components | Estimate | (Asymptotic standard error) |
|---|---|---|
| Plan specific growth component (standard deviation) | 1.7297 *** | (0.5516) |
| Plan specific level component (standard deviation) | 6.8656 *** | (2.4185) |
| Correlation in level and growth components | -0.6580 * | (0.3930) |
| Autocorrelation parameter | 0.7135 *** | (0.1546) |
| Transitory error (standard deviation) | 5.5070 *** | (0.3640) |
| Log Likelihood for Full Model | -25812.62 | |

• Significant variance components for both levels and growth
• Negative correlation between level and growth components
• Significant AR(1)

Simulated Transition Probabilities
(Absolute rankings with thresholds based on mean in 2000 and +/- one std dev)
Probability of being above upper threshold in year:

RESULTS USING ONLY 1 YEAR OF PAST PERFORMANCE

| Rating in t-1 (1998) | Rating in t (1999) | t+1 (2000) | t+2 (2001) |
|---|---|---|---|
| N/A | Above upper threshold | 0.66 | 0.53 |
| N/A | In middle range | 0.12 | 0.13 |
| N/A | Below lower threshold | 0.00 | 0.00 |

RESULTS USING 2 YEARS OF PAST PERFORMANCE

| Rating in t-1 (1998) | Rating in t (1999) | t+1 (2000) | t+2 (2001) |
|---|---|---|---|
| Above upper threshold | Above upper threshold | 0.74 | 0.58 |
| In middle range | Above upper threshold | 0.60 | 0.47 |
| Below lower threshold | In middle range | 0.02 | 0.04 |
| Below lower threshold | Below lower threshold | 0.00 | 0.00 |

Simulated Transition Probabilities
(Relative rankings with thresholds based on top, bottom and middle tercile)
Probability of being above upper threshold in year:

RESULTS USING ONLY 1 YEAR OF PAST PERFORMANCE

| Rating in t-1 | Rating in t | t+1 (2000) | t+2 (2001) |
|---|---|---|---|
| ALL | Above upper threshold | 0.80 | 0.69 |
| ALL | In middle range | 0.16 | 0.17 |
| ALL | Below lower threshold | 0.01 | 0.00 |

RESULTS USING 2 YEARS OF PAST PERFORMANCE

| Rating in t-1 | Rating in t | t+1 (2000) | t+2 (2001) |
|---|---|---|---|
| Above upper threshold | Above upper threshold | 0.75 | 0.75 |
| In middle range | Above upper threshold | 0.60 | 0.57 |
| Below lower threshold | Above upper threshold | 0.54 | 0.49 |
| Below lower threshold | Below lower threshold | 0.03 | 0.03 |

Effect of Covariates on Transition Probabilities
Effect of HMO Penetration and Profit Status on Transition Probabilities
Probability of being above the upper threshold in t+2

| Rating in t-1 | Rating in t | Results from base model | Increase in HMO penetration 1 SD from mean | Profit | Not for profit |
|---|---|---|---|---|---|
| Above upper threshold | Above upper threshold | 0.58 | 0.65 | 0.44 | 0.65 |
| In middle range | Above upper threshold | 0.47 | 0.53 | 0.45 | 0.56 |
| Below lower threshold | Above upper threshold | 0.04 | 0.05 | 0.00 | 0.05 |
| ALL | Above upper threshold | 0.53 | 0.59 | 0.50 | 0.61 |
| Below lower threshold | In middle range | 0.04 | 0.07 | 0.02 | 0.07 |
| ALL | In middle range | 0.13 | 0.16 | 0.10 | 0.17 |
| Above upper threshold | Below lower threshold | 0.00 | 0.00 | 0.00 | 0.01 |

Results Summary
• High plan performance in the past suggests, but does not guarantee, high performance in the future
  – High plan performance over multiple years is a stronger predictor of high performance in the future
• Plan-level covariates can affect the transition probabilities, but sizable unmeasured heterogeneity and autocorrelation induce an element of persistence in plan performance

Policy/Practice Relevance
• Many public reporting and incentive payment efforts implicitly assume that performance is stable or increasing, which may not always be the case
• Consumer-directed health care approaches rely on information being available, but also on its being useful in terms of correctly forecasting future performance
• A simulation approach such as the one used here may be a useful way of forecasting future performance for purposes of informing the decisions of purchasers and consumers
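
The stability assumption in the first policy bullet can be checked on reported rates before fitting any model, by tabulating unadjusted transition frequencies as in the earlier unadjusted table. The sketch below is a minimal illustration, not the authors' code: the function names and the small data array are hypothetical, and the thresholds are assumed to be the reference-year mean plus or minus one sample standard deviation.

```python
import numpy as np

def classify(rates, ref_year_col):
    """Return 0 (below lower threshold), 1 (middle) or 2 (above upper threshold)
    for every plan-year; -1 marks a missing rate.

    Thresholds are the mean +/- one std dev of the reference-year column.
    """
    ref = rates[:, ref_year_col]
    mean, sd = np.nanmean(ref), np.nanstd(ref, ddof=1)
    states = np.full(rates.shape, -1)
    observed = ~np.isnan(rates)
    vals = rates[observed]
    states[observed] = (vals > mean - sd).astype(int) + (vals > mean + sd).astype(int)
    return states

def transition_to_top(states, t):
    """For each rating in year t, return (share above the upper threshold in t+1, N).

    Only plans observed in both years are counted; the share is None when N is 0.
    """
    out = {}
    for s in (0, 1, 2):
        both = (states[:, t] == s) & (states[:, t + 1] >= 0)
        n = int(both.sum())
        share = float((states[both, t + 1] == 2).mean()) if n else None
        out[s] = (share, n)
    return out

# Hypothetical rates for five plans (rows) over three years (columns).
rates = np.array([
    [90.0, 91.0, 92.0],
    [80.0, 88.0, 89.0],
    [80.0, 81.0, 79.0],
    [80.0, 79.0, 82.0],
    [70.0, 71.0, 75.0],
])
states = classify(rates, ref_year_col=0)
table = transition_to_top(states, t=0)
print(table)
```

With real HEDIS panels the cells for rare histories are small, which is part of the motivation for the model-based simulation.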