D/RS 1013: Logistic Regression

Some Questions
- Do children have a better chance of surviving a severe illness than adults?
- Can income, credit history, and education distinguish those who will repay a loan from those who will not?
- Are clients with high scores on a personality test more likely to respond to psychotherapy than clients with low scores?
- Can scores on a math pretest predict who will pass or fail a course?

Answering These Questions
- Linear regression? Why not? A straight line can predict values below 0 or above 1, which cannot be interpreted as probabilities.
- Logistic regression answers the same questions as discriminant analysis, but without that technique's assumptions about the data.
- Logistic regression expects a nonlinear relationship:
  – an S-shaped (sigmoidal) curve
  – the curve never falls below 0 or rises above 1
  – predicted values are interpreted as the probability of group membership

The Logistic Curve
- Math data: scores of 1-11 on a pretest; fail = 0, pass = 1.

Residuals
- Residuals are generally small, and largest in the middle of the curve.
- Residual = actual value - predicted value.
- Example: for a student with a pretest score of 5 who passed the test, 1 (actual value) - .21 (predicted value) = .79 (residual, or estimation error).
- There are two possible residual values for each value of the predictor, one for each outcome.

Different Shapes and Directions
- The logistic curve can take different shapes and directions; a negative curve slopes downward, with the probability decreasing as the predictor increases.

Assumptions
- Outcomes on the DV are mutually exclusive and exhaustive.
- Sample size recommendations range from 10-50 cases per IV.
- Too small a sample can lead to:
  – extremely high parameter estimates and standard errors
  – failure to converge

Assumptions (cont.)
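The residual arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not code from the course; it assumes the fitted values A = -14.79 and B = 2.69 that the math-pretest example reports later in these notes.

```python
import math

# Fitted constant and coefficient from the math-pretest example
# (reported later in these notes).
A, B = -14.79, 2.69

def p_pass(score):
    """Logistic (sigmoid) probability of passing: never below 0 or above 1."""
    u = A + B * score
    return math.exp(u) / (1 + math.exp(u))

# Residual = actual value - predicted value.
# A student with a pretest score of 5 who passed (actual value = 1):
predicted = p_pass(5)     # about .21
residual = 1 - predicted  # about .79
```

Note that a student with the same score of 5 who failed would instead have residual 0 - .21 = -.21, which is why each predictor value has two possible residuals.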
- To address this, either increase cases or decrease predictors.
- Large samples are required for maximum likelihood estimation.

Testing the Overall Model
- "Constant only" model: no IVs entered; yields the first -2 log likelihood.
- Full model: all IVs entered; yields the second -2 log likelihood.
- The difference between the two is the overall "model" chi-square; if p < .05, the model provides classification power.

Coefficients and Testing
- Each coefficient (B) is the natural log of the odds ratio associated with its variable.
- Convert to odds by raising e to the B power.
- The significance of each coefficient is tested via the associated Wald statistic, which is similar to the t used to test coefficients in linear regression; p < .05 indicates that the coefficient is not zero.

Coefficient Interpretation
- Interpret odds ratios, not the raw coefficients, but the sign of a B coefficient still gives us information:
  – positive B coefficient: odds increase as the predictor increases
  – negative B coefficient: odds decrease as the predictor increases

Coefficient Interpretation (cont.)
- Taking exp(B) converts the coefficient to odds: the change in odds associated with a one-unit increase in the predictor.
- To see the change associated with a two-unit increase in the predictor:
  – multiply B by 2 before raising e to that power
  – that is, calculate e^(2Bi)

The Logistic Model

  Ŷi = e^u / (1 + e^u)

where:
- Ŷi = estimated probability
- u = A + BX (in our math example), or more generally, with multiple predictors, u = A + B1X1 + B2X2 + ... + BkXk (k = # of predictors)

Applying the Model
- Math data: the constant (intercept) and coefficient were found to be A = -14.79 and B = 2.69.
- For a pretest score of 5, we want to find the probability of passing:
  u = -14.79 + 2.69(5) = -1.34
  Ŷi = e^-1.34 / (1 + e^-1.34) = .2618 / 1.2618 = .2075

Converting to Odds
- odds = p(target) / p(other) = .2075 / .7925 = .2618

Applying the Model (cont.)
- For a pretest score of 7: u = -14.79 + 2.69(7) = 4.04
  Ŷi = e^4.04 / (1 + e^4.04) = 56.826 / 57.826 = .9827
- odds = .9827 / .0173 = 56.8263

Crosschecking
- 56.8263 / .2618 = 217.03, which not coincidentally equals (within rounding error) e^(2 x 2.69) = e^5.38 = 217.022.
- Since we moved 2 units on the predictor, we multiply B by 2 before finding exp(B).
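The odds conversions and the crosscheck can be reproduced directly. A short sketch in Python, using the A = -14.79 and B = 2.69 values from the math-pretest example (the helper names are my own):

```python
import math

# Constant and coefficient from the math-pretest model in the notes.
A, B = -14.79, 2.69

def prob_pass(score):
    """Estimated probability: e^u / (1 + e^u) with u = A + B*score."""
    u = A + B * score
    return math.exp(u) / (1 + math.exp(u))

def odds(p):
    """Convert a probability to odds: p(target) / p(other)."""
    return p / (1 - p)

odds5 = odds(prob_pass(5))   # about .2618
odds7 = odds(prob_pass(7))   # about 56.83

# Crosscheck: moving 2 units on the predictor multiplies the odds by e^(2B).
ratio = odds7 / odds5        # about 217, i.e. exp(2 * 2.69)
```

Because the odds at any score equal e^u exactly, the ratio of odds two score points apart works out to e^(2B) without any rounding error.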
Confidence Intervals for Coefficients
- Odds ratios for coefficients are presented with 95% confidence intervals.
- If 1 falls within the CI, the coefficient is not statistically significant at the .05 level (an odds ratio of 1 means the odds do not change with the predictor).

Classification Table
- Same idea as the classification results (confusion matrix) in discriminant analysis.
- Overall % accuracy = N (on diagonal) / total N
- Sensitivity = % of the target group accurately classified
- Specificity = % of the "other" group correctly classified

Final Points
- General procedure:
  – fit the model
  – remove nonsignificant predictors
  – rerun, reporting only the significant predictors
- Cross-validation: generate or modify the model with one half of the sample, then test the classification with the other half.
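The classification-table percentages can be sketched as follows. The 2x2 counts below are hypothetical, invented for illustration; only the accuracy, sensitivity, and specificity formulas come from the notes.

```python
# Hypothetical 2x2 classification (confusion) table for a pass/fail model
# (rows = actual group, columns = predicted group; counts are made up):
tn, fp = 40, 10   # actual fail: predicted fail / predicted pass
fn, tp = 5, 45    # actual pass: predicted fail / predicted pass

total = tn + fp + fn + tp
accuracy = (tn + tp) / total      # overall % accuracy = N on diagonal / total N
sensitivity = tp / (tp + fn)      # % of target (pass) group correctly classified
specificity = tn / (tn + fp)      # % of "other" (fail) group correctly classified
```

With these counts, accuracy is .85, sensitivity .90, and specificity .80, showing that an impressive overall accuracy can mask uneven performance across the two groups.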