STAT 557 ASSIGNMENT 6                                   Name ________________

Reading Assignment: Lloyd, Chapter 4

Written Assignment:
    On-campus: Due Friday, November 10
    Off-campus: Due Friday, November 17

Second exam: The second exam is a take-home exam. It will be distributed in class on Friday, November 10, and will be due on Friday, November 17. Additional information will be sent to distance students.

1. The following table shows numbers of colonies of TA98 Salmonella observed on plates treated with five levels of quinoline. Three plates were prepared for each level of quinoline.

    log10(dose) of quinoline      Numbers of colonies of
    (mg per plate)                TA98 Salmonella
    ------------------------      ----------------------
            0                        12     17     25
            1.0                      21     26     28
            1.5                      26     26     35
            2.0                      28     33     50
            2.5                      34     45     47

Let $Y_{ij}$ denote the count for the j-th plate prepared with the i-th level of quinoline. Suppose the $Y_{ij}$'s are independent Poisson random variables with $E(Y_{ij}) = m_i$, where $\log(m_i) = \beta_0 + \beta_1 X$ and $X$ is the log10(dose) value for the corresponding level of quinoline.

(a) Write down the formula for the log-likelihood function for this model.

(b) Find the maximum likelihood estimates for $\beta_0$ and $\beta_1$. Give an interpretation of these estimated coefficients.

(c) Construct a 95% confidence interval for the mean number of colonies when the log-dose of quinoline is 2.2.

(d) Test the fit of the model against the alternative of independent Poisson counts with a different mean at each level of quinoline. Report a p-value and state your conclusion.

(e) Now fit the model $\log(m_i) = \beta_0 + \beta_1 X$ with $m_i = E(Y_{ij})$, where $Y_{ij}$ has a negative binomial distribution. Report estimates of $\beta_0$, $\beta_1$ and the dispersion parameter, and report their standard errors. Does the estimate of the dispersion parameter indicate that there is more variation in the counts than can be accounted for by a Poisson regression model? Explain.

(f) Use the model in Part (e) to construct a 95% confidence interval for the mean number of colonies when the log-dose of quinoline is 2.2. Compare this with your answer to Part (c).
(g) (Extra Credit) Explore adding higher order polynomial terms to the original Poisson regression model and the model in Part (e). What is the “best” model? Use your estimate of the “best” model to construct a 95% confidence interval for the mean number of colonies when the log-dose of quinoline is 2.2.

2. The data in the following table give the numbers of litters of mice with no deaths and at least one death within each of 10 categories formed by litter size and whether the mother was exposed to treatment A or treatment B during pregnancy. The objective is to determine how litter size and pre-natal exposure to treatment A or treatment B affect the probability of at least one death.

                                   Number of Deaths
    Litter Size    Treatment     None     At Least 1
    -----------    ---------     ----     ----------
         7             A          58          16
                       B          75          26
         8             A          49          24
                       B          58          25
         9             A          33          33
                       B          45          32
        10             A          15          28
                       B          39          40
        11             A           4          29
                       B           5          23

(a) Fit separate logistic regression models for treatments A and B to model the probability of at least one death in a litter as a function of litter size, i.e., fit the model

$$\log\left(\frac{\pi_{ij}}{1 - \pi_{ij}}\right) = \alpha_i + \beta_i X, \qquad i = 1, 2, \quad j = 1, \ldots, 5,$$

where $\pi_{ij}$ is the probability of at least one death in a litter, $\alpha_1$ and $\beta_1$ correspond to treatment A, $\alpha_2$ and $\beta_2$ correspond to treatment B, and $X$ is the litter size. Report values for parameter estimates and standard errors, and provide interpretations of the parameter estimates with respect to how litter size and treatments affect the probability of at least one death in a litter.

(b) Plot the estimated curves for $\pi_{ij}$ against $X$. Put both curves on the same graph. Also plot the observed proportions $P_{ij} = Y_{ij}/n_{ij}$ against $X$ on the same plot. Use different plotting symbols to distinguish treatment A from treatment B. Attach the plot to this assignment. Does the plot indicate any deficiency in the model? Explain.

(c) Fit a logistic regression model with the same slope for treatments A and B, i.e.,

$$\log\left(\frac{\pi_{ij}}{1 - \pi_{ij}}\right) = \alpha + \beta X, \qquad i = 1, 2, \quad j = 1, \ldots, 5.$$

Report m.l.e.’s for $\alpha$ and $\beta$ and the standard errors.
(d) Compute a deviance test of the fit of the model in Part (c) against the model in Part (a). Report the value of a test statistic, degrees of freedom and a p-value. State your conclusion. (Note that this could be a comparison of two models, neither of which is appropriate for these data.)

(e) You could also consider altering the link function or changing the way that litter size is entered into the model. Explore this and report estimates for parameters in the “best” model you can find. Provide an interpretation of your final model with respect to how the treatments and litter size affect the probability of at least one death in a litter.
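
As a starting point for Problem 2, the fits in Parts (a) and (c) and the deviance comparison in Part (d) can be sketched in plain Python. This is a minimal sketch under stated assumptions, not a prescribed method: the function name `fit_logistic` and all variable names are hypothetical, and the counts are entered on the assumption that, for each litter size, the table lists (None, At Least 1) as A = (58, 16), (49, 24), (33, 33), (15, 28), (4, 29) and B = (75, 26), (58, 25), (45, 32), (39, 40), (5, 23). The fitting method is Newton-Raphson on the grouped binomial log-likelihood; any standard GLM routine should give the same estimates.

```python
import math

def fit_logistic(x, deaths, totals, tol=1e-10, max_iter=50):
    """Fit logit(pi) = a + b*x to grouped binomial data by Newton-Raphson."""
    pbar = sum(deaths) / sum(totals)
    a, b = math.log(pbar / (1.0 - pbar)), 0.0      # start at the null model
    for _ in range(max_iter):
        s_a = s_b = i_aa = i_ab = i_bb = 0.0
        for xi, yi, ni in zip(x, deaths, totals):
            p = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = ni * p * (1.0 - p)                 # binomial variance
            r = yi - ni * p                        # observed minus fitted count
            s_a += r                               # score for a
            s_b += r * xi                          # score for b
            i_aa += w                              # information matrix entries
            i_ab += w * xi
            i_bb += w * xi * xi
        det = i_aa * i_bb - i_ab * i_ab
        step_a = (i_bb * s_a - i_ab * s_b) / det   # inverse information times score
        step_b = (i_aa * s_b - i_ab * s_a) / det
        a += step_a
        b += step_b
        if abs(step_a) + abs(step_b) < tol:
            break
    # Standard errors from the inverse information of the last Newton pass
    # (computed just before the final, negligible, step).
    se_a, se_b = math.sqrt(i_bb / det), math.sqrt(i_aa / det)
    fitted = [1.0 / (1.0 + math.exp(-(a + b * xi))) for xi in x]
    # Residual deviance against the saturated model (one probability per group).
    dev = 0.0
    for yi, ni, p in zip(deaths, totals, fitted):
        if yi > 0:
            dev += 2.0 * yi * math.log(yi / (ni * p))
        if yi < ni:
            dev += 2.0 * (ni - yi) * math.log((ni - yi) / (ni * (1.0 - p)))
    return {"alpha": a, "beta": b, "se_alpha": se_a, "se_beta": se_b,
            "fitted": fitted, "deviance": dev}

# Data as read from the table (ASSUMED pairing of counts, see the note above).
size = [7, 8, 9, 10, 11]
none_a, deaths_a = [58, 49, 33, 15, 4], [16, 24, 33, 28, 29]
none_b, deaths_b = [75, 58, 45, 39, 5], [26, 25, 32, 40, 23]
total_a = [u + v for u, v in zip(none_a, deaths_a)]
total_b = [u + v for u, v in zip(none_b, deaths_b)]

# Part (a): separate intercept and slope for each treatment.
fit_a = fit_logistic(size, deaths_a, total_a)
fit_b = fit_logistic(size, deaths_b, total_b)
# Part (c): one alpha and one beta for all 10 groups.
fit_c = fit_logistic(size + size, deaths_a + deaths_b, total_a + total_b)

# Part (d): deviance difference, 4 parameters versus 2, so 2 degrees of freedom.
g2 = fit_c["deviance"] - (fit_a["deviance"] + fit_b["deviance"])
df = 2
p_value = math.exp(-g2 / 2.0)   # chi-square upper tail; exact only for df = 2

for name, fit in (("A", fit_a), ("B", fit_b), ("pooled", fit_c)):
    print(name, fit["alpha"], fit["se_alpha"], fit["beta"], fit["se_beta"])
print("G2 =", g2, "df =", df, "p =", p_value)
```

The closed-form p-value uses the fact that a chi-square variable with 2 degrees of freedom has upper-tail probability exp(-x/2). If the reduced model were instead given separate intercepts and a common slope, the comparison would have 1 degree of freedom and a general chi-square routine would be needed.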