STAT 557          FALL 2000          FINAL EXAM          NAME ________________

Instructions: You may use a calculator and the formula sheets you brought to this exam. No other notes or books are allowed. Write your answers in the spaces provided below. If you need more space, use the back of the page or attach additional sheets of paper, but clearly indicate where this is done. You need not complete numerical computations; you will receive full credit for showing that you know how to solve the problem. Be sure to define any notation you use that is not defined in the statement of a problem.

1. To determine if the incidence rates of a particular bad side effect are different for two anesthetics, patients undergoing a certain surgical procedure were randomly assigned to one of the two anesthetics, labeled A or B. Each patient was classified according to whether or not they experienced the particular side effect. One of the 10 patients who received anesthetic A experienced the side effect, and three of the 8 patients who received anesthetic B experienced the side effect. Describe an appropriate test procedure and show how to compute the p-value.

2. A company that sells its products by mailing offers to potential customers currently has about a 4% response rate. It has been purchasing address lists for potential customers from supplier A. Supplier B claims that it can provide lists of addresses for potential customers that will generate a higher response rate. To investigate this claim, the company will mail offers to a random sample of n addresses provided by supplier A, and it will mail the same offers to a random sample of n addresses provided by supplier B. How big should n be? The company would like to have a good chance (at least 80%) of showing a difference at the .05 level of significance if addresses from supplier B can increase the response rate by at least one percentage point. Display the formula you would use to determine n.

3. To study the potential benefit of a new drug for treating a breathing condition, 400 subjects who suffered from the breathing condition were randomly divided into two groups of equal size. Subjects in one group received the drug in pill form, and subjects in the other group received a placebo (a pill that did not contain the drug). Each subject took two pills each day for six weeks. Each of the subjects used an inhaler to help relieve the effects of the breathing condition. At the beginning of the study each subject was examined and classified as either a heavy or a moderate inhaler user. At the end of the six-week study period each subject was re-examined and classified as either a heavy or a moderate inhaler user. The placebo group was included in this study because changes in weather and other environmental conditions during the study period could have an impact on changes in inhaler use. Show how you would determine if the drug was effective in reducing inhaler use.

4. In a study of a chemotherapy treatment for leukemia, the result for each of 170 treated patients was coded as

      1 for remission
      0 for no remission

   The covariates are

      $X_1$ = percentage of cells undergoing DNA synthesis in the presence of chemotherapy
      $X_2$ = highest recorded patient temperature (°F) prior to the start of the chemotherapy treatment

   Data for 12 of the patients are given in the following table.

      Patient   Result    X1      X2
         1        0      0.11     99.0
         2        1      0.19    101.4
         3        0      0.05    102.0
         4        0      0.10    100.4
         5        0      0.06     99.0
         6        1      0.11     98.6
         7        0      0.04    101.0
         8        0      0.06    102.0
         9        1      0.10    100.2
        10        0      0.16     98.8
        11        1      0.17     98.6
        12        1      0.09     98.6

   The researchers fit the following model to these data:

   \[ \log\left( \frac{\pi_i}{1 - \pi_i} \right) = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} \]

   where $\pi_i$ is the conditional probability of remission given the values of $(X_{1i}, X_{2i})$. They assumed that each leukemia patient responded independently of any other patient.

   (a) Write out a formula for the likelihood function they maximized to obtain maximum likelihood estimates for the parameters in this model.
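For orientation only, the following minimal Python sketch evaluates a Bernoulli log-likelihood of the logistic form stated in Problem 4 on the 12 tabulated patients. It is an illustration under stated assumptions, not a model answer: the function name `log_likelihood` is hypothetical, only the 12 listed patients (not all 170) are used, and any parameter values passed in are arbitrary.

```python
import math

# Data for the 12 patients listed in the exam table (assumed pairing:
# the i-th Result, X1 and X2 values belong to patient i).
result = [0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1]
x1 = [0.11, 0.19, 0.05, 0.10, 0.06, 0.11, 0.04, 0.06, 0.10, 0.16, 0.17, 0.09]
x2 = [99.0, 101.4, 102.0, 100.4, 99.0, 98.6, 101.0, 102.0, 100.2, 98.8, 98.6, 98.6]


def log_likelihood(b0, b1, b2):
    """Bernoulli log-likelihood for log(pi_i / (1 - pi_i)) = b0 + b1*X1i + b2*X2i.

    Hypothetical helper for illustration; uses only the 12 listed patients.
    """
    total = 0.0
    for y, u, v in zip(result, x1, x2):
        eta = b0 + b1 * u + b2 * v  # linear predictor (log-odds of remission)
        # Contribution y*eta - log(1 + exp(eta)), with log(1 + exp(eta))
        # computed in a numerically stable way.
        total += y * eta - (max(eta, 0.0) + math.log1p(math.exp(-abs(eta))))
    return total
```

Calling `log_likelihood(b0, b1, b2)` with trial values shows how the likelihood changes with the parameters; maximizing it over all 170 patients would require the full data set, which is not reproduced here.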
   (b) The values of the maximum likelihood estimates for the parameters and the estimated covariance matrix are as follows:

   \[
   \hat{\boldsymbol{\beta}} =
   \begin{pmatrix} \hat{\beta}_0 \\ \hat{\beta}_1 \\ \hat{\beta}_2 \end{pmatrix}
   =
   \begin{pmatrix} 45.4 \\ 33.0 \\ -0.5 \end{pmatrix},
   \qquad
   \hat{V}_{\hat{\boldsymbol{\beta}}} =
   \begin{pmatrix}
   2193.92 & 294.89 & -22.43 \\
   294.89 & 186.29 & -3.18 \\
   -22.43 & -3.18 & 0.23
   \end{pmatrix}
   \]

   Use these results to obtain the maximum likelihood estimate of the value of $X_1$ needed to achieve a 0.90 probability of remission when $X_2$ = 100.0°F.

   (c) Show how to compute a standard error for the estimate in part (b).

   (d) Describe the steps you would take to assess the fit of the proposed model, and, if necessary, find a better model.

5. In a study of the relationships between car size and severity of accident injuries, a simple random sample of n = 1200 accident reports was selected, without replacement, from a file of over 2 million automobile accident reports maintained by the State of California. The sampled records were classified into a 4×3×2×2 contingency table with respect to the levels of the following four factors:

   Accident type (T):
      (i=1) Collision with another vehicle, no rollover
      (i=2) Collision with an object, no rollover
      (i=3) Rollover with no collision
      (i=4) Rollover involving a collision

   Accident severity for the driver (S):
      (j=1) Not severe
      (j=2) Moderately severe
      (j=3) Severe

   Car size (C):
      (k=1) Compact (smaller cars)
      (k=2) Standard (larger cars)

   Ejection of driver from the vehicle (E):
      (l=1) No
      (l=2) Yes

   In answering the following questions, let $\pi_{ijkl}$ denote the probability that a randomly selected accident record is classified as the i-th accident type, the j-th level of severity for the driver, the k-th car size, and the l-th ejection category. Let $m_{ijkl}$ and $Y_{ijkl}$ denote the corresponding expected and observed counts, respectively.

   A. Write out the formula for the largest (least parsimonious) log-linear model that satisfies the following null hypothesis:

      $H_0$: Given the type of accident (T), the accident severity for the driver is conditionally independent of both car size (C) and whether or not the driver is ejected (E).

   B. Consider the log-linear model

   \[
   \log(m_{ijkl}) = \lambda + \lambda^{T}_{i} + \lambda^{S}_{j} + \lambda^{C}_{k} + \lambda^{E}_{l}
   + \lambda^{TS}_{ij} + \lambda^{TC}_{ik} + \lambda^{TE}_{il}
   + \lambda^{SC}_{jk} + \lambda^{SE}_{jl} + \lambda^{TSC}_{ijk}
   \]

   where $\lambda^{T}_{1} = \lambda^{S}_{1} = \lambda^{C}_{1} = \lambda^{E}_{1} = 0$ and any interaction parameter is constrained to be zero when any factor involved in the interaction is at its lowest level. Maximum likelihood estimates of the parameters and their standard errors are shown in the table below. Use this information to answer the following questions. If you do not have enough information to complete an answer, describe the formula or method you would use and the additional information that you would need to complete the answer.

      (i) The value of the Pearson chi-square test statistic for testing the fit of this model against the general alternative is 20.65. All of the estimated expected counts are larger than 5.8. What are the degrees of freedom for the chi-square approximation to the null distribution of this test statistic?

      (ii) Compute the maximum likelihood estimate for the odds ratio corresponding to the odds that a driver who is ejected from the vehicle is severely injured divided by the odds that a driver who is not ejected from the vehicle is severely injured.

      (iii) Show how to construct a 95% confidence interval for the odds ratio in part (ii).

      Parameter                 Estimate   Standard Error   Estimate / (Std. Error)
      ---------                 --------   --------------   -----------------------
      $\lambda$                   3.666        0.029              126.86
      $\lambda^{T}_{2}$          -0.324        0.040               -8.09
      $\lambda^{T}_{3}$          -0.186        0.024               -7.65
      $\lambda^{T}_{4}$          -0.033        0.016               -2.00
      $\lambda^{S}_{2}$           0.078        0.032                2.4
      $\lambda^{S}_{3}$          -0.218        0.023               -9.79
      $\lambda^{C}_{2}$           0.742        0.026               28.61
      $\lambda^{E}_{2}$          -0.673        0.023              -29.12
      $\lambda^{TS}_{22}$         0.073        0.030                2.45
      $\lambda^{TS}_{32}$         0.230        0.022               10.49
      $\lambda^{TS}_{42}$         0.177        0.017               10.21
      $\lambda^{TS}_{23}$        -0.014        0.029               -0.49
      $\lambda^{TS}_{33}$         0.003        0.020                0.16
      $\lambda^{TS}_{43}$         0.075        0.012                6.32
      $\lambda^{TC}_{22}$         0.007        0.033                0.22
      $\lambda^{TC}_{32}$        -0.166        0.023               -7.35
      $\lambda^{TC}_{42}$        -0.015        0.015               -1.00
      $\lambda^{TE}_{22}$         0.062        0.031                2.01
      $\lambda^{TE}_{32}$         0.253        0.021               12.52
      $\lambda^{TE}_{42}$         0.094        0.013                7.29
      $\lambda^{SC}_{22}$         0.057        0.026                2.18
      $\lambda^{SC}_{32}$         0.051        0.021                2.42
      $\lambda^{SE}_{22}$         0.179        0.027                6.60
      $\lambda^{SE}_{32}$         0.206        0.017               12.29
      $\lambda^{TSC}_{222}$      -0.076        0.030               -2.58
      $\lambda^{TSC}_{322}$       0.041        0.022                1.93
      $\lambda^{TSC}_{422}$      -0.024        0.017               -1.40
      $\lambda^{TSC}_{232}$       0.013        0.028                0.47
      $\lambda^{TSC}_{332}$       0.016        0.019                0.86
      $\lambda^{TSC}_{432}$       0.012        0.012                1.07

      (iv) What do the parameter estimates for this model imply about associations between severity of driver injuries and car size and accident type?

   C. Show how you could express the model in part B as a logistic regression model.

EXAM SCORE _____________          COURSE GRADE __________