Modeling Diabetic Hospitalizations for the TennCare Population

Application of Predictive Modeling for Care Management Panel
AcademyHealth Annual Research Meeting
June 28, 2005
Boston

Avery Ashby MS
Soyal Momin MS, MBA
Raymond Phillippi PhD
Allen Naidoo PhD
Judy Slagle RN, MPA

Background

Management Programs
• BlueCross BlueShield of Tennessee provides care management programs for members with certain chronic illnesses or conditions.
• Care managers are licensed nurses.
• Diabetes is a prevalent chronic illness affecting our managed TennCare population.
• Modeling of diabetic inpatient hospitalizations can help in identifying members at higher risk and directing them to care management.

Methodology

Study Design
• Diabetic members were identified using member-level claims data.
• Data were collected for continuously enrolled diabetic members for the time period of July 1, 2001 through June 30, 2003.
• Year 1 member-specific data were used to model whether a diabetic hospitalization occurred in Year 2.
• Logistic regression was employed to model the probability of a diabetic hospitalization in Year 2.

Time Period

| Year 1 | Year 2 |
|---|---|
| July 1, 2001 – June 30, 2002 | July 1, 2002 – June 30, 2003 |
| Member Specific Data | Diabetic Hospitalization? |

Data Elements

Demographics
• Gender
• Age
• Zip Code: Metropolitan & Rural
• Region: Multiple Regions
• Eligibility: Medicaid subcategories, not including dual-eligible members

Utilization
• Diabetic Hospitalizations
• Emergency Room Encounters
• Ophthalmologist Encounters
• Primary Care Physician (PCP) Encounters
• Endocrinologist Encounters
• Total Specialist Encounters

Pharmacy
• Insulin Prescriptions: Prescribed or Not
• Misc. Anti-diabetic Prescriptions: Prescribed or Not
• Sulfonylurea Prescriptions: Prescribed or Not
• Caloric Agents: Prescribed or Not
• Total Prescriptions (Any variety)

Evidence-Based Guidelines
• Cholesterol Screening: Received or Not
• Eye Examination: Received or Not
• Microalbuminuria Screening: Received or Not
• HbA1c Screening: Received or Not

Diagnosis and Risk Score
• Insulin Dependency: Dependent or Not
• Total Co-morbidities
• Diagnostic Cost Grouper (DCG) Risk Score

General Data Characteristics
• Members: 11,002 (313 Year 2 Hospitalizations)
• Gender: Female 64.7%
• Age: Mean 47, Median 50

Predictive Model

Model Specifics

$$
P(\text{hospitalization}) = \frac{1}{1 + e^{-z}}
$$

where

$$
\begin{aligned}
z = {} & -2.160 \\
& + 1.164 \times \text{Diabetic Hospitalizations} \\
& - 0.328 \times \text{No Insulin Prescribed} \\
& - 0.038 \times \text{Age} \\
& + 0.092 \times \text{Diagnostic Cost Grouper Risk Score} \\
& + 0.199 \times \text{No Misc. Anti-diabetic Prescribed} \\
& + 0.208 \times \text{Ophthalmologist Encounters} \\
& - 0.015 \times \text{Primary Care Physician Encounters} \\
& - 0.361 \times \text{Non-Insulin Dependent} \\
& + 0.054 \times \text{Emergency Room Encounters} \\
& - 0.031 \times \text{Total Specialist Encounters}
\end{aligned}
$$

Sensitivity vs. Specificity

[Figure: Receiver Operating Characteristic (ROC) curve. x-axis: False Positive Rate; y-axis: True Positive Rate. Area Under the Curve (AUC) = 0.830]

Model Specifics: Odds Ratio Estimates

| Covariate | Comparison | Odds Ratio | Lower Limit | Upper Limit |
|---|---|---|---|---|
| Diabetic Hospitalizations | | 3.203 | 2.541 | 4.039 |
| Insulin Prescribed | No vs. Yes | 0.519 | 0.275 | 0.982 |
| Age | | 0.963 | 0.955 | 0.971 |
| Diagnostic Cost Grouper Risk Score | | 1.096 | 1.077 | 1.116 |
| Misc. Anti-diabetic Prescribed | No vs. Yes | 1.489 | 1.154 | 1.922 |
| Ophthalmologist Encounters | | 1.232 | 1.101 | 1.377 |
| Primary Care Physician Encounters | | 0.985 | 0.970 | 0.999 |
| Insulin Dependency | No vs. Yes | 0.486 | 0.254 | 0.929 |
| Emergency Room Encounters | | 1.056 | 1.026 | 1.086 |
| Total Specialist Encounters | | 0.969 | 0.951 | 0.987 |

Diagnostics

| Covariate | Tolerance* |
|---|---|
| Diabetic Hospitalizations | 0.92 |
| Insulin Prescriptions | 0.19 |
| Age | 0.91 |
| Diagnostic Cost Grouper Risk Score | 0.66 |
| Anti-diabetic Prescriptions | 0.94 |
| Ophthalmologist Encounters | 0.94 |
| Primary Care Physician Encounters | 0.71 |
| Insulin Dependency | 0.19 |
| Emergency Room Encounters | 0.54 |
| Total Specialist Encounters | 0.40 |

*Tolerance is $1 - R_x^2$, where $R_x^2$ is the proportion of variance in each covariate, X, explained by all of the other covariates.

Goodness of Fit: Hosmer-Lemeshow (H-L)

| Decile | Hospitalization Observed | Hospitalization Predicted | No Hospitalization Observed | No Hospitalization Predicted | Total |
|---|---|---|---|---|---|
| 1 | 7 | 4.69 | 1,117 | 1,119.31 | 1,124 |
| 2 | 4 | 6.34 | 1,098 | 1,095.66 | 1,102 |
| 3 | 9 | 7.89 | 1,103 | 1,104.11 | 1,112 |
| 4 | 7 | 9.65 | 1,114 | 1,111.35 | 1,121 |
| 5 | 6 | 11.61 | 1,084 | 1,078.39 | 1,090 |
| 6 | 16 | 15.08 | 1,084 | 1,084.92 | 1,100 |
| 7 | 16 | 20.58 | 1,082 | 1,077.42 | 1,098 |
| 8 | 28 | 29.45 | 1,073 | 1,071.55 | 1,101 |
| 9 | 51 | 47.18 | 1,051 | 1,054.82 | 1,102 |
| 10 | 169 | 160.04 | 883 | 891.96 | 1,052 |

Chi-square: 7.716
Significance: 0.4617

Model Performance

| Prediction | Actual Stay | Actual No Stay | Totals |
|---|---|---|---|
| Stay | 38 | 34 | 72 |
| No Stay | 275 | 10,655 | 10,930 |
| Totals | 313 | 10,689 | 11,002 |

| Measure | Value |
|---|---|
| Correct Prediction Rate | 97.2% |
| Sensitivity | 12.1% |
| Specificity | 99.7% |
| Positive Predictive Value (PPV) | 52.8% |
| Negative Predictive Value (NPV) | 97.5% |
| Pseudo-R² | 0.223 |

Rational Artificial Intelligence

Initial RAI Results
• An artificial neural network (ANN) was trained and validated on the entire data set.
• This was problematic because the ANN tried to maximize the overall correct prediction rate.
• Results were similar to those of the logistic regression model.

RAI Model Performance

| Prediction | Actual Stay | Actual No Stay | Totals |
|---|---|---|---|
| Stay | 34 | 7 | 41 |
| No Stay | 279 | 10,682 | 10,961 |
| Totals | 313 | 10,689 | 11,002 |

| Measure | Value |
|---|---|
| Correct Prediction Rate | 97.4% |
| Sensitivity | 10.9% |
| Specificity | 99.9% |
| Positive Predictive Value (PPV) | 82.9% |
| Negative Predictive Value (NPV) | 97.5% |
| Pseudo-R² | N/A |

Forced Learning Solution
• Collect equal samples from hospitalized and non-hospitalized members.
• Build the ANN on this 1:1 (150:150) training data set.
• Validate the ANN on the remaining out-of-sample members.
• Repeat the process to ensure that the overall pattern is accounted for.
• Develop credibility intervals for sensitivity, specificity, PPV, and NPV based on this repeated process.

Forced Learning Model Performance
• Results of the repeated forced learning method were collected.
• 95% credibility intervals were derived from MCMC simulation using WinBUGS 1.4.

| Measure | 95% Credibility Interval |
|---|---|
| Sensitivity | [66.00%, 70.80%] |
| Specificity | [76.06%, 78.13%] |
| Positive Predictive Value (PPV) | [4.11%, 4.49%] |
| Negative Predictive Value (NPV) | [98.36%, 98.73%] |

Research Implications

Finding a Balance
• Begins with the question of allocated resources.
• The logistic regression model and the ANN identified a small percentage of members with an actual Year 2 hospitalization, with a “reasonable” PPV.
• The ANN using the Forced Learning Method identified a much larger percentage of members with an actual Year 2 hospitalization, with a low PPV.

Coverage

[Figure: Coverage diagrams for the Logistic Regression Model and the Forced Learning ANN. In each panel the regions are labeled Predicted hospitalization, No hospitalization, and hospitalization.]

Future Considerations
• Other covariates such as lab values, Health Risk Assessments (HRAs), and psychological indicators.
• Using a meta-model where clusters of homogeneous sub-groups are modeled separately [and possibly] with differing methods.
• Modeling the probability of hospitalizations related to co-morbid conditions instead of diabetic hospitalizations.

Contact Information
Avery Ashby MS
Senior Research Analyst
Health Intelligence Group
801 Pine Street – 3E
Chattanooga, TN 37402
423.763.7482 p
423.785.8083 f
avery_ashby@healthintelgroup.com
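Appendix: Worked Example

The logistic model on the Model Specifics slide and the measures on the Model Performance slides can be reproduced with a few lines of Python. The block below is a minimal sketch, not the authors' implementation. The coefficients and cell counts are copied from the slides; the dictionary keys, function names, and the example member are hypothetical. The slides do not state how the three Yes/No covariates are coded, so the sketch assumes ±1 effect coding (No = +1, Yes = −1), chosen only because exp(2 × coefficient) then matches the reported No vs. Yes odds ratios.

```python
import math

# Intercept and coefficients as printed on the "Model Specifics" slide.
INTERCEPT = -2.160
COEFFICIENTS = {
    "diabetic_hospitalizations": 1.164,
    "no_insulin_prescribed": -0.328,            # Yes/No covariate
    "age": -0.038,
    "dcg_risk_score": 0.092,
    "no_misc_antidiabetic_prescribed": 0.199,   # Yes/No covariate
    "ophthalmologist_encounters": 0.208,
    "pcp_encounters": -0.015,
    "non_insulin_dependent": -0.361,            # Yes/No covariate
    "er_encounters": 0.054,
    "total_specialist_encounters": -0.031,
}


def effect_code(is_no: bool) -> int:
    """ASSUMPTION: Yes/No covariates use +1 for 'No' and -1 for 'Yes'."""
    return 1 if is_no else -1


def hospitalization_probability(member: dict) -> float:
    """Apply P = 1 / (1 + e^-z) to one member's Year 1 covariates."""
    z = INTERCEPT + sum(coef * member[name] for name, coef in COEFFICIENTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def confusion_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard 2x2 measures; tp/fp are predicted stays, fn/tn predicted no stays."""
    total = tp + fp + fn + tn
    return {
        "correct_prediction_rate": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical member, for illustration only (not taken from the study data).
example_member = {
    "diabetic_hospitalizations": 1,
    "no_insulin_prescribed": effect_code(is_no=False),
    "age": 50,
    "dcg_risk_score": 3.0,
    "no_misc_antidiabetic_prescribed": effect_code(is_no=True),
    "ophthalmologist_encounters": 0,
    "pcp_encounters": 6,
    "non_insulin_dependent": effect_code(is_no=False),
    "er_encounters": 2,
    "total_specialist_encounters": 4,
}

if __name__ == "__main__":
    print(f"Predicted probability: {hospitalization_probability(example_member):.3f}")
    # Cell counts from the logistic regression "Model Performance" table.
    for name, value in confusion_metrics(tp=38, fp=34, fn=275, tn=10655).items():
        print(f"{name}: {value:.1%}")
```

Calling `confusion_metrics(tp=34, fp=7, fn=279, tn=10682)` does the same for the RAI Model Performance table. If the Yes/No covariates were instead coded 0/1, the predicted probabilities would differ, so the coding should be confirmed before using the coefficients for scoring.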