MS4225: Business Research Modeling
Mid-session test
8 March 2011

Question 1

Answer True/False to each of the following questions:

1) In a binary Logit model, the dependent variable is assumed to follow a logistic distribution.
2) The Hosmer-Lemeshow test compares the fitted model with the model that contains no explanatory variable but the intercept.
3) For the binary Logit model, the predicted value of the dependent variable cannot be a probability unless an explicit constraint is imposed on the model.
4) Fisher Scoring used under PROC LOGISTIC in SAS is another name for maximum likelihood.
5) Given models A and B, the log-likelihood for A is always no smaller than the log-likelihood for B provided that B is nested within A.
6) Imposing restrictions on the unknown parameters can never result in an increase in the model's likelihood value even if the restrictions are correct. It is for this reason that the likelihood ratio test statistic cannot take on a negative value.
7) The term "asymptotic" is statistical jargon referring to small samples.
8) For the binary Logit model, the change in the log-odds in favour of the target event due to a change in an explanatory variable is not a constant; instead, it varies across the population observations.
9) In the binary Logit model, if an estimated coefficient has a negative sign, then an increase in the corresponding explanatory variable always results in a decrease in the probability being modeled.
10) An estimated Logit model cannot be very good if the number of discordants exceeds the number of concordants.

Question 2

A binary logit model has been fitted to explain the choice between a cat (Y = 1) and a dog (Y = 0) as the family pet for n = 23 households, each having either a cat or a dog.
The following explanatory variables are used:

X1 = 1 if the family lives in a house, 0 if the family lives in an apartment;
X2 = the number of males minus the number of females in the household;
X3 = the average age of the household;
X4 = the gross annual income of the household (in thousands of U.S. dollars).

The SAS results based on PROC LOGISTIC are given in the Appendix.

1) Write down the estimated Logit model.
2) Compute the likelihood-ratio test statistic and use it to test for the overall significance of the model. Use a 5% level of significance.
3) In terms of AIC and SBC, is the estimated model any better than the model that contains an intercept only? Explain.
4) Compute the Tau-a, Gamma, Somers' D and c statistics. What conclusion can you draw from these statistics?
5) Compute the Generalized R² for the above regression and interpret the result.
6) What are the values of X1, X2, X3 and X4 for the following household: a 35-year-old female earning U.S. $70,000 per year and living alone in an apartment?
7) Predict the probability of the above household having a dog as a pet.
8) Verify the odds ratio estimate of 0.804 for X3 as shown in the SAS output and interpret this estimate.

Appendix

Model Information

Response Variable             y
Number of Response Levels     2
Model                         binary logit
Optimization Technique        Fisher's scoring

Number of Observations Read   23
Number of Observations Used   23

Response Profile

Ordered Value   y   Total Frequency
1               1   9
2               0   14

Probability modeled is y=1.

Model Convergence Status

Convergence criterion (GCONV=1E-8) satisfied.

Model Fit Statistics

Criterion   Intercept Only   Intercept and Covariates
AIC         32.789           25.712
SC          33.925           31.389
-2 Log L    30.789           15.712

Testing Global Null Hypothesis: BETA=0

Test               Chi-Square   DF   Pr > ChiSq
Likelihood Ratio   ?????        ?    0.0045
Score              11.9009      ?    0.0181
Wald               4.5367       ?    0.3382

The LOGISTIC Procedure

Analysis of Maximum Likelihood Estimates

Parameter   DF   Estimate   Standard Error   Wald Chi-Square   Pr > ChiSq
Intercept   1     7.9627    6.6899           1.4167            0.2339
x1          1    -8.9105    4.9690           3.2156            0.0729
x2          1     1.7876    1.3850           1.6660            0.1968
x3          1    -0.2181    0.1374           2.5202            0.1124
x4          1     0.0266    0.0587           0.2051            0.6506

Odds Ratio Estimates

Effect   Point Estimate   95% Wald Confidence Limits
x1       <0.001           <0.001     2.290
x2       5.975            0.396      90.205
x3       0.804            0.614      1.052
x4       1.027            0.915      1.152

Association of Predicted Probabilities and Observed Responses

Percent Concordant   91.3    Somers' D   ???
Percent Discordant   8.7     Gamma       ???
Percent Tied         0.0     Tau-a       ???
Pairs                126     c           ???
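
The quantities requested in Question 2 can be reproduced from the Appendix figures alone. The following is a minimal Python sketch, not part of the original test paper. It assumes the usual textbook definitions: LR = difference in -2 Log L, AIC = -2 Log L + 2k, SC = -2 Log L + k ln(n), generalized R² = 1 - exp(-LR/n), and the standard pair-count formulas for Somers' D, Gamma, Tau-a and c. Two further assumptions are labeled in the comments: the concordant and discordant counts are recovered by rounding the printed percentages, and the household in part 6 is coded as X1 = 0, X2 = -1, X3 = 35, X4 = 70. All function and variable names are illustrative.

```python
import math

# Figures copied from the Appendix (PROC LOGISTIC output).
N = 23                              # observations used
N_EVENTS, N_NONEVENTS = 9, 14       # y = 1 (cat), y = 0 (dog)
NEG2LOGL_NULL, NEG2LOGL_FULL = 30.789, 15.712
COEF = {"Intercept": 7.9627, "x1": -8.9105, "x2": 1.7876,
        "x3": -0.2181, "x4": 0.0266}


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; this closed form holds for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** j / math.factorial(j)
                                 for j in range(df // 2))


def info_criteria(neg2logl, k, n):
    """Return (AIC, SC) for a model with k estimated parameters."""
    return neg2logl + 2 * k, neg2logl + k * math.log(n)


def association(pct_conc, pct_disc, n_events, n_nonevents, n):
    """Rank-association measures from concordant/discordant percentages."""
    pairs = n_events * n_nonevents
    # Assumption: pair counts are recovered by rounding the printed percentages.
    nc = round(pct_conc / 100 * pairs)
    nd = round(pct_disc / 100 * pairs)
    return {
        "pairs": pairs,
        "somers_d": (nc - nd) / pairs,
        "gamma": (nc - nd) / (nc + nd),
        "tau_a": (nc - nd) / (0.5 * n * (n - 1)),
        "c": (nc + 0.5 * (pairs - nc - nd)) / pairs,
    }


def prob_y1(x):
    """Fitted P(y = 1) for a dict of covariate values keyed x1..x4."""
    eta = COEF["Intercept"] + sum(COEF[name] * value for name, value in x.items())
    return 1.0 / (1.0 + math.exp(-eta))


lr = NEG2LOGL_NULL - NEG2LOGL_FULL          # likelihood-ratio statistic
df = len(COEF) - 1                          # four slope restrictions
p_value = chi2_sf_even_df(lr, df)
aic_null, sc_null = info_criteria(NEG2LOGL_NULL, 1, N)
aic_full, sc_full = info_criteria(NEG2LOGL_FULL, len(COEF), N)
assoc = association(91.3, 8.7, N_EVENTS, N_NONEVENTS, N)
gen_r2 = 1.0 - math.exp(-lr / N)            # generalized R-squared
odds_ratio_x3 = math.exp(COEF["x3"])

# Assumed coding of the part-6 household: apartment, one female, age 35, $70k.
household = {"x1": 0, "x2": -1, "x3": 35, "x4": 70}
p_dog = 1.0 - prob_y1(household)            # y = 0 is "dog"

print(f"LR = {lr:.3f} on {df} df, p = {p_value:.4f}")
print(f"AIC: {aic_null:.3f} vs {aic_full:.3f}; SC: {sc_null:.3f} vs {sc_full:.3f}")
print({name: round(value, 3) for name, value in assoc.items()})
print(f"Generalized R2 = {gen_r2:.3f}, OR(x3) = {odds_ratio_x3:.3f}, P(dog) = {p_dog:.3f}")
```

As a consistency check, the p-value computed for the likelihood-ratio statistic agrees with the 0.0045 printed in the Testing Global Null Hypothesis table, and the computed AIC and SC values match the Model Fit Statistics table to three decimals.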