STAT 557 Solutions to Assignment 2, Fall 2002

1. (a) Assuming a multinomial distribution for each country, the expected counts are

             Under      Normal     Over       Obese
    GB        77.42591  154.3160   29.73798   11.52012
    CAN      123.37095  245.8881   47.38469   18.35623
    USA       88.20314  175.7959   33.87733   13.12365

The test results for the null hypothesis of the homogeneous model are

                 X^2       G^2
    statistic    34.3963   34.9478
    d.f.         6         6
    p-value      0.000+    0.000+

The null hypothesis is rejected; i.e., the distribution of women across the weight categories is not the same for all three countries.

(b) The Pearson residuals $(Y_{ij} - \hat{m}_{ij})/\sqrt{\hat{m}_{ij}}$ are displayed in the following table and plot.

             Under      Normal      Over        Obese
    GB       -1.639457  -0.1059374   2.6153232   0.4360123
    CAN       2.217385   0.1984510  -2.5255033  -2.4171856
    USA      -1.086405  -0.1354480   0.5365021   2.4502312

[Figure, Prob 1(b): Residuals. Pearson residuals plotted against weight category (Under, Normal, Over, Obese), with separate lines for US, Britain, and Canada.]

The residual plot reveals a similar pattern of residuals for Great Britain and United States women: the frequencies of heavy women are underestimated, while those of underweight women are overestimated, under the null hypothesis of identical multinomial models for the three countries. On the other hand, in Canada the proportion of underweight women is larger than the proportion expected under the null hypothesis, while the proportions of overweight and obese women are smaller than expected under the null hypothesis.

2. (a) The use of a multinomial distribution would be appropriate if either

i. a simple random sample of twins is selected with replacement from all twins born at the hospital in a specific time period, or

ii. the number of twins born at the hospital is much larger than 98 and a simple random sample of twins is selected without replacement from this large population of twins, or

iii. the 98 sets of twins represent all of the twins born at the hospital in the specified time period and there is an underlying biological mechanism associated with the birth of twins that makes each occurrence of twins an independent event with a constant set of probabilities for the various types of twins.

It is unlikely that either (i) or (ii) is reasonable in this case.

(b) The log-likelihood function is given by

  $\ell(\pi; Y) = \log(n!) - \sum_{i=1}^{3} \log(Y_i!) + \sum_{i=1}^{3} Y_i \log \pi_i.$

To obtain the mle under the restriction $g(\pi) = \sum_{i=1}^{3} \pi_i = 1$, maximize

  $\ell^*(\pi; Y) = \ell(\pi; Y) + \lambda\left(1 - \sum_{i=1}^{3} \pi_i\right).$

Solving the likelihood equations

  $0 = \frac{\partial \ell}{\partial \pi_i} - \lambda \frac{\partial g}{\partial \pi_i}, \quad i = 1, 2, 3, \qquad \sum_{i=1}^{3} \pi_i = 1,$

the mle's turn out to be $\hat{\pi}_i^A = Y_i/n$, $i = 1, 2, 3$, or

  $\hat{\pi}^A = (\hat{\pi}_1^A, \hat{\pi}_2^A, \hat{\pi}_3^A) = (0.2959, 0.3673, 0.3367).$

(c) For Model B, the probability of exactly $j-1$ boys is $\binom{2}{j-1}\pi^{j-1}(1-\pi)^{3-j}$ for $j = 1, 2, 3$. Then the probability for each category is

    Sex                  prob under Model B
    Two Boys             $\pi_1^B = \pi^2$
    Two Girls            $\pi_2^B = (1-\pi)^2$
    One Boy / One Girl   $\pi_3^B = 2\pi(1-\pi)$

The log-likelihood for this model is

  $\ell(\pi; Y) = \log(n!) - \sum_{i=1}^{3} \log(Y_i!) + 2Y_1 \log \pi + 2Y_2 \log(1-\pi) + Y_3\{\log 2 + \log \pi + \log(1-\pi)\}.$

Solving the likelihood equation

  $0 = \frac{\partial \ell}{\partial \pi} = \frac{2Y_1 + Y_3}{\pi} - \frac{2Y_2 + Y_3}{1-\pi},$

the mle of $\pi$ is computed as

  $\hat{\pi} = \frac{2Y_1 + Y_3}{2(Y_1 + Y_2 + Y_3)} = \frac{\text{total number of boys}}{\text{total number of children}} = 0.4643,$

and the mle of $\pi^B$ is

  $\hat{\pi}^B = (\hat{\pi}_1^B, \hat{\pi}_2^B, \hat{\pi}_3^B) = (\hat{\pi}^2, (1-\hat{\pi})^2, 2\hat{\pi}(1-\hat{\pi})) = (0.2156, 0.2870, 0.4974).$

(d) The expected counts under this model are (21.125, 28.125, 48.750). (Note that the estimated expected counts satisfy the same constraint as the observed counts: $98 = n = Y_1 + Y_2 + Y_3 = 29 + 36 + 33 = 21.125 + 28.125 + 48.750$. This provides a check on your computations.)
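As a numerical check on parts (c) and (d), the Model B computations can be reproduced in a few lines of R (or S-PLUS). This sketch is added for illustration and is not part of the original solution; it uses only the observed counts (29, 36, 33) given above.

    # Model B: estimate pi and compare observed and expected counts
    y <- c(29, 36, 33)                       # two boys, two girls, one of each
    n <- sum(y)                              # 98
    pihat <- (2 * y[1] + y[3]) / (2 * n)     # total boys / total children = 0.4643
    p <- c(pihat^2, (1 - pihat)^2, 2 * pihat * (1 - pihat))
    m <- n * p                               # 21.125 28.125 48.750
    X2 <- sum((y - m)^2 / m)                 # Pearson statistic, 10.229
    G2 <- 2 * sum(y * log(y / m))            # deviance statistic, 10.398
    1 - pchisq(c(X2, G2), df = 1)            # d.f. = 3 - 1 - 1 = 1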
Goodness of fit statistics are shown below.

          stat       df   pval
    G^2   10.39752   1    0.00126
    X^2   10.22911   1    0.00138

We conclude that Model B does not fit well; i.e., the sex of one twin is in some way related to the sex of the other twin. In the next part we will see that this could be a result of the fact that some twins are monozygotic.

(e) For Model C,

  $\pi_1^C = \Pr\{\text{2 boys}\} = \Pr\{\text{2 boys} \mid \text{monozygotic}\}\Pr\{\text{monozygotic}\} + \Pr\{\text{2 boys} \mid \text{dizygotic}\}\Pr\{\text{dizygotic}\} = (1/2)\alpha + (1/2)^2(1-\alpha) = (1/4)(1+\alpha).$

Similarly, we can compute $\pi_2^C$ and $\pi_3^C$ as shown in the following table.

    Sex                  Probability under Model C
    Two Boys             $\pi_1^C = (1+\alpha)/4$
    Two Girls            $\pi_2^C = (1+\alpha)/4$
    One Boy / One Girl   $\pi_3^C = (1-\alpha)/2$

The log-likelihood under this model is written as

  $\ell(\alpha; Y) = \log(n!) - \sum_{i=1}^{3} \log(Y_i!) + Y_1 \log\{(1+\alpha)/4\} + Y_2 \log\{(1+\alpha)/4\} + Y_3 \log\{(1-\alpha)/2\}.$

Solving the likelihood equation

  $0 = \frac{\partial \ell}{\partial \alpha} = \frac{Y_1 + Y_2}{1+\alpha} - \frac{Y_3}{1-\alpha},$

the mle of $\alpha$ is computed as

  $\hat{\alpha} = \frac{Y_1 + Y_2 - Y_3}{Y_1 + Y_2 + Y_3} = 0.3265,$

and the mle of $\pi^C$ is

  $\hat{\pi}^C = (\hat{\pi}_1^C, \hat{\pi}_2^C, \hat{\pi}_3^C) = (0.3316, 0.3316, 0.3367).$

(f) The expected counts under this model are (32.5, 32.5, 33.0), and the results of the tests are shown in the following table.

          stat        df   pval
    G^2   0.7553101   1    0.3848002
    X^2   0.7538462   1    0.3852612

There is not enough evidence to reject the null hypothesis, so Model C is consistent with the data.

(g) No. Model B is a special case of Model A, and Model C is a special case of Model A, but neither Model B nor Model C is nested in the other.

(h) For Model D,

  $\pi_1^D = \Pr\{\text{2 boys}\} = \Pr\{\text{2 boys} \mid \text{monozygotic}\}\Pr\{\text{monozygotic}\} + \Pr\{\text{2 boys} \mid \text{dizygotic}\}\Pr\{\text{dizygotic}\} = \alpha\pi + (1-\alpha)\pi^2,$

  $\pi_2^D = \Pr\{\text{2 girls}\} = \alpha(1-\pi) + (1-\alpha)(1-\pi)^2,$

and

  $\pi_3^D = 2\pi(1-\pi)(1-\alpha).$

Since there are two functionally independent parameters $(\alpha, \pi)$ and $\pi_1 + \pi_2 + \pi_3 = 1$, the mle's $\hat{\alpha}, \hat{\pi}$ will be such that

  $\hat{\pi}_1^D = Y_1/n = \hat{\pi}_1^A, \quad \hat{\pi}_2^D = Y_2/n = \hat{\pi}_2^A, \quad \hat{\pi}_3^D = Y_3/n = \hat{\pi}_3^A.$

Both Model A and Model D are saturated models; the estimated proportions are the observed proportions. Then there is no difference between the estimates for Models A and D, and $G^2 = X^2 = 0$ with 0 d.f. Consequently, there is no information in the available data that can distinguish between Models A and D.

One alternative would be to classify twins into 5 categories:

    Category                            Count   prob under Model D
    monozygotic twins, 2 boys           Y_1     $\pi_1 = \alpha\pi$
    monozygotic twins, 2 girls          Y_2     $\pi_2 = \alpha(1-\pi)$
    dizygotic twins, 2 boys             Y_3     $\pi_3 = (1-\alpha)\pi^2$
    dizygotic twins, 2 girls            Y_4     $\pi_4 = (1-\alpha)(1-\pi)^2$
    dizygotic twins, 1 boy and 1 girl   Y_5     $\pi_5 = 2(1-\alpha)\pi(1-\pi)$

The log-likelihood under this model is

  $\ell(\alpha, \pi; Y) = \log(n!) - \sum_{i=1}^{5} \log(Y_i!) + Y_1 \log\{\alpha\pi\} + Y_2 \log\{\alpha(1-\pi)\} + Y_3 \log\{(1-\alpha)\pi^2\} + Y_4 \log\{(1-\alpha)(1-\pi)^2\} + Y_5 \log\{2(1-\alpha)\pi(1-\pi)\}.$

Solving the likelihood equations

  $0 = \frac{\partial \ell}{\partial \alpha} = \frac{Y_1 + Y_2}{\alpha} - \frac{Y_3 + Y_4 + Y_5}{1-\alpha},$

  $0 = \frac{\partial \ell}{\partial \pi} = \frac{Y_1 + 2Y_3 + Y_5}{\pi} - \frac{Y_2 + 2Y_4 + Y_5}{1-\pi},$

the mle's of $\alpha$ and $\pi$ are computed as

  $\hat{\alpha} = \frac{Y_1 + Y_2}{n} = \frac{\text{number of monozygotic pairs}}{\text{number of pairs of twins}}, \qquad \hat{\pi} = \frac{Y_1 + 2Y_3 + Y_5}{Y_1 + Y_2 + 2(Y_3 + Y_4 + Y_5)},$

and $X^2$ and $G^2$ would have $5 - 1 - 2 = 2$ d.f. for testing the fit of Model D against the general alternative.
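The Model C fit in parts (e) and (f) can be verified the same way. Again, this is a sketch added for illustration, not part of the original solution.

    # Model C: estimate alpha, expected counts, and goodness-of-fit statistics
    y <- c(29, 36, 33)
    n <- sum(y)
    ahat <- (y[1] + y[2] - y[3]) / n                       # 0.3265
    p <- c((1 + ahat) / 4, (1 + ahat) / 4, (1 - ahat) / 2) # 0.3316 0.3316 0.3367
    m <- n * p                                             # 32.5 32.5 33.0
    X2 <- sum((y - m)^2 / m)                               # 0.7538
    G2 <- 2 * sum(y * log(y / m))                          # 0.7553
    1 - pchisq(c(X2, G2), df = 1)                          # 0.385 0.385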
3. (a) Without combining any birth weight categories, $X^2 = 71.25$ with d.f. = 8, and the chi-square approximation to the distribution of the Pearson statistic, when the null hypothesis is true, yields p-value < 0.0001. We conclude that SIDS rates are not the same for all birth weight categories. Here the observed proportions of SIDS cases decrease across the last 7 weight categories, from over 13 cases per 1000 births in weight category 3 to under 1 case per 1000 births in weight category 9.

(b) Since two of the estimated expected counts are smaller than one and four of the estimated expected counts are smaller than 5, the large sample chi-square approximation to the distributions of the Pearson and deviance statistics may not provide accurate p-values. One option would be to combine the results for the three smallest birth weight categories. This yields a Pearson statistic that also rejects the hypothesis of equal SIDS rates across all birth weight categories ($X^2 = 60.28$ with d.f. = 5 and p-value < 0.0001).

(c) The formula is $\hat{p} \pm z_{.025}\sqrt{\hat{p}(1-\hat{p})/n}$. Here, $n = 2061$ and $\hat{p} = 2/2061$. This yields lower limit = -0.00037 and upper limit = 0.00231. The lower limit is negative because the mean count is too small for the large sample normal approximation to the binomial distribution to give accurate results.

(d) The 95% exact confidence interval is (0.00012, 0.00350). This is a more reasonable confidence interval. The interval in part (c) would have a coverage probability smaller than 0.95.

(e) If you condition on the total counts in the birth weight categories, the SIDS case counts, $Y_i$ for the $i$-th birth weight category, have independent binomial distributions with probability $\pi_i$ of a SIDS case and sample size $n_i$. From the large sample normal approximation,

  $\hat{p}_5 - \hat{p}_7 \;\dot{\sim}\; N\!\left(\pi_5 - \pi_7,\; \frac{p_5(1-p_5)}{n_5} + \frac{p_7(1-p_7)}{n_7}\right).$

Then, an approximate 95% confidence interval for $\pi_5 - \pi_7$ is

  $\hat{p}_5 - \hat{p}_7 \pm 1.96\sqrt{\frac{\hat{p}_5(1-\hat{p}_5)}{n_5} + \frac{\hat{p}_7(1-\hat{p}_7)}{n_7}} \;=\; (0.00050, 0.00281).$

Since the confidence interval does not include zero, there is evidence that the incidence of SIDS is significantly higher in group 5 than in group 7.

4. (a) The m.l.e. for the mean number of accidents per driver in the Poisson model is

  $\hat{m} = \frac{1}{n}\sum_{i=1}^{16} (i-1)Y_i = 3.1566,$

where $Y_i$ is the number of drivers with $i-1$ accidents. This estimate was computed without combining any categories.

(b) Combining some categories to avoid small expected counts, as suggested in the statement of the problem, the Pearson statistic for testing the fit of the i.i.d. Poisson model is $X^2 = 26.578$ with 7 d.f. and p-value = 0.0004. The Poisson model does not appear to be appropriate.

(c) In this case, $\bar{Y} = 3.1566$ and $S^2 = 5.5899$. The deviance statistic is $nS^2/\bar{Y} = 293.9618$ with 165 d.f. and p-value < 0.0001. This test shows that the variance of the numbers of accidents is larger than the mean. Hence, the i.i.d. Poisson model is not appropriate.

(d) Maximum likelihood estimates of the expected counts for the Poisson and negative binomial models are shown in the following table. Maximum likelihood estimates of the parameters in the negative binomial probability function are $\hat{\pi} = 0.5751$ and $\hat{K} = 4.2716$.

    Number of    Number of   Poisson   Neg. Binomial
    Accidents    Drivers     Model     Model
    0            15           7.066    15.619
    1            32          22.306    28.353
    2            26          35.206    31.757
    3            29          37.044    28.212
    4            22          29.234    21.795
    5            19          18.456    15.322
    6             9           9.710    10.061
    7             8           4.379     6.274
    8             3           1.728     3.756
    9             1           0.606     2.176
    10            0           0.191     1.227
    11            0           0.055     0.677
    12            1           0.014     0.366
    13            0           0.004     0.195
    14            0           0.001     0.102
    15 or more    1           0.000     0.107

(e) Combining some categories to avoid small expected counts, as suggested in the statement of the problem, the value of the Pearson goodness-of-fit statistic is $X^2 = 3.990$ with 8 d.f. and p-value = 0.858. The negative binomial model is consistent with the observed data.
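The dispersion test in part (c) is easy to reproduce from the frequency table in part (d). The following R sketch is added for illustration; it treats the single driver in the "15 or more" category as having exactly 15 accidents, which matches the reported mean, and uses the divisor-$n$ form of $S^2$ implied by the reported value 5.5899.

    # Variance-to-mean test of the i.i.d. Poisson model, part (c)
    acc <- 0:15
    f <- c(15, 32, 26, 29, 22, 19, 9, 8, 3, 1, 0, 0, 1, 0, 0, 1)
    n <- sum(f)                          # 166 drivers
    ybar <- sum(acc * f) / n             # 3.1566
    s2 <- sum((acc - ybar)^2 * f) / n    # 5.5899 (divisor n, as in the solution)
    disp <- n * s2 / ybar                # 293.96
    1 - pchisq(disp, df = n - 1)         # p-value < 0.0001 on 165 d.f.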
(f) Do not use the i.i.d. Poisson model to construct the confidence interval, because it was shown above that the Poisson model is inappropriate. There are several ways, however, to obtain an approximate 95% confidence interval for the mean number of accidents per bus driver in a two year period.

i. You could use the central limit theorem to show that $\bar{Y}$ has a limiting normal distribution. You must also show that $S^2$ is a consistent estimator for $\sigma^2 = Var(Y_i)$. Then an approximate 95% confidence interval is

  $\bar{Y} \pm (1.96)\sqrt{S^2/n} \;\Rightarrow\; (2.80, 3.52).$

This method does not assume any particular distribution for the counts, but it does assume that the counts for the $n$ drivers are independent and identically distributed random variables.

ii. Assuming the negative binomial model is appropriate, $Var(Y_i) = K(1-\pi)/\pi^2$ and $Var(\bar{Y}) = K(1-\pi)/(n\pi^2)$. Then an approximate 95% confidence interval is

  $\bar{Y} \pm (1.96)\sqrt{\frac{\hat{K}(1-\hat{\pi})}{n\hat{\pi}^2}} \;\Rightarrow\; (2.80, 3.51).$

iii. You could use the delta method to find the limiting normal distribution for the m.l.e. of the mean,

  $\hat{m}_{nb} = g(\hat{\pi}, \hat{K}) = \frac{\hat{K}(1-\hat{\pi})}{\hat{\pi}} = \frac{(4.2716)(1 - 0.5751)}{0.5751} = 3.1566 = \bar{Y}.$

The computer output inverts the estimated Fisher information matrix to estimate the covariance matrix of $(\hat{\pi}, \hat{K})'$ as

  $\hat{V} = \begin{pmatrix} 0.004623843 & 0.077362578 \\ 0.077362578 & 1.35233094 \end{pmatrix}.$

Compute the first partial derivatives of $g(\pi, K)$,

  $G = \left(\frac{\partial g}{\partial \pi}, \frac{\partial g}{\partial K}\right) = \left(-\frac{K}{\pi^2}, \frac{1-\pi}{\pi}\right).$

Then

  $\hat{G} = \left(-\frac{\hat{K}}{\hat{\pi}^2}, \frac{1-\hat{\pi}}{\hat{\pi}}\right) = (-12.9176, 0.7390),$

and $Var(\hat{m}_{nb})$ is estimated as $\hat{G}\hat{V}\hat{G}' = 0.03307$. Using the large sample normal approximation to the distribution of the m.l.e. $\hat{m}_{nb}$, an approximate 95% confidence interval for the mean number of accidents per driver is

  $3.156626 \pm (1.96)\sqrt{0.03307} \;\Rightarrow\; (2.8002, 3.5130).$

This method is also based on the belief that the negative binomial model is appropriate. Note that the confidence interval based on the incorrect Poisson model,

  $\hat{m}_p \pm (1.96)\sqrt{\hat{m}/n} \;\Rightarrow\; (2.8863, 3.4269),$

is too short because the Poisson model does not allow for enough dispersion in the counts.

iv. You could use the delta method and large sample normality of m.l.e.'s to first construct a confidence interval for

  $\log(m_{nb}) = \log(K) + \log(1-\pi) - \log(\pi),$

then apply the exponential function to the endpoints of that interval to obtain an approximate 95% confidence interval for $m_{nb}$. When would this be better than approach (iii)?

5. See problem 8 on assignment 3.

6. There are only seven tables with the same row and column totals as the observed table. These tables can be distinguished by the value of $Y_{11}$. The exact probabilities are presented in the following table.

    Table Number   Y_11   Exact Probability
    1              0      0.010098
    2              1      0.080784   (observed table)
    3              2      0.239828
    4              3      0.338580
    5              4      0.239828
    6              5      0.080784
    7              6      0.010098

Looking at the proportions of high blood pressure cases for the aspirin treatment group and the control group, only table 1 is less consistent with the null hypothesis than the observed table, in the direction of the alternative that aspirin is better at controlling blood pressure than the placebo. Consequently, the p-value, 0.010098 + 0.080784 = 0.0909, is the sum of the probabilities for tables 1 and 2. These data are inconclusive; they do not allow us to reject the null hypothesis that the high blood pressure rates are the same for the aspirin group and the control group at the 0.05 level of significance.
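The seven probabilities above follow a hypergeometric distribution. The solution does not restate the margins of the observed table, but margins of 6 high blood pressure cases among 40 subjects, with 20 subjects per treatment group, reproduce the listed values exactly; the R sketch below (added for illustration) makes that assumption.

    # Exact null distribution of Y11 for problem 6
    # Assumed margins: 6 cases in total, 40 subjects, 20 per group
    probs <- dhyper(0:6, m = 6, n = 34, k = 20)
    round(probs, 6)      # 0.010098 0.080784 0.239828 0.338580 0.239828 ...
    sum(probs[1:2])      # tables 1 and 2: one-sided p-value = 0.0909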
7. Since the objective is to show that the IFN-B treatment is better, we should test the null hypothesis that the IFN-B treatment has the same effect as the control treatment against a directional alternative where the IFN-B treatment is "better". Different definitions of what it means for the IFN-B treatment to be "better" result in different answers.

The row totals in this table are fixed by the randomization procedure that places 10 subjects in each treatment group. Under the null hypothesis that the IFN-B and control treatments are equally effective, the column totals are also fixed for these particular subjects. There are 43 possible tables with these row and column totals. Each table is distinguished by the values of $Y_{11}$ and $Y_{12}$, the first two counts in the first row of the table. The values of these two counts, the Pearson $X^2$ value, and the probability that each table occurs by random assignment of subjects to treatment groups are shown below.

    Table Number   Y_11   Y_12   Pearson X^2   Exact Probability
    1              6      0      14.67            15/184756
    2              6      4      12.00            70/184756
    3              6      1      10.50           160/184756
    4              6      3       9.17           336/184756
    5              6      2       8.67           420/184756
    6              5      5       9.17           336/184756
    7              5      0      13.33            36/184756
    8              5      1       7.83           720/184756
    9              5      4       5.33          2520/184756   (observed table)
    10             5      2       4.67          3360/184756
    11             5      3       3.83          5040/184756
    12             4      6       8.67           420/184756
    13             4      0      14.67            15/184756
    14             4      1       7.83           720/184756
    15             4      2       3.33          6300/184756
    16             4      3       1.17         16800/184756
    17             4      4       1.33         15750/184756
    18             4      5       3.83          5040/184756
    19             3      7      10.50           160/184756
    20             3      6       4.67          3360/184756
    21             3      5       1.17         16800/184756
    22             3      4       0.00         28000/184756
    23             3      3       1.17         16800/184756
    24             3      2       4.67          3360/184756
    25             3      1      10.50           160/184756
    26             2      3       3.83          5040/184756
    27             2      4       1.33         15750/184756
    28             2      5       1.17         16800/184756
    29             2      6       3.33          6300/184756
    30             2      7       7.83           720/184756
    31             2      8      14.67            15/184756
    32             2      2       8.67           420/184756
    33             1      5       3.83          5040/184756
    34             1      4       5.33          2520/184756
    35             1      8      13.33            36/184756
    36             1      7       7.83           720/184756
    37             1      6       4.67          3360/184756
    38             1      3       9.17           336/184756
    39             0      6       8.67           420/184756
    40             0      5       9.17           336/184756
    41             0      7      10.50           160/184756
    42             0      4      12.00            70/184756
    43             0      8      14.67            15/184756

Note that in this case the exact probabilities provide the same ordering of the tables as the Pearson $X^2$ values. Table 9 is the observed table, and the EXACT option in PROC FREQ in SAS computes a two-sided p-value of 0.0642 by adding the probabilities for tables 1, 2, 3, 4, 5, 6, 7, 8, 9, 12, 13, 14, 19, 25, 30, 31, 32, 34, 35, 36, 38, 39, 40, 41, 42, 43. The fisher.test( ) function in S-PLUS yields the same p-value by summing the probabilities for all tables that occur with probability no larger than the probability of the observed table. This is not necessarily a good way to define a critical region or define a p-value. This set of tables includes many tables that have fewer treated patients that either improve or stay the same than the observed table has, so it may not provide an appropriate p-value with respect to the objective of this study.

One reasonable criterion is that tables provide more evidence than the observed table that the IFN-B treatment is better if at least 5 of the treated patients show improvement and at least 9 of the treated patients either improve or stay the same (or no more than one of the treated patients becomes worse). This includes tables 2, 4, 6, 9, and the p-value is 0.01766. Another criterion is that the difference between the number of treated patients that improve and the number of treated patients that get worse should be at least $5 - 1 = 4$. This results in a p-value of 0.0222. Many students failed to clearly describe the criterion used to identify the possible tables that were included in the evaluation of the p-value.

8. (a) Here it means that a 95% confidence interval is no wider than $\hat{\pi} \pm 0.03$. Hence,

  $(1.96)\sqrt{\frac{\hat{\pi}(1-\hat{\pi})}{1100}} < 0.03$

for any value of $\hat{\pi}$. (The maximum occurs at $\hat{\pi} = 0.50$.)

(b) Use $\pi = 0.50$ and find $n$ such that

  $(1.96)\sqrt{\frac{\hat{\pi}(1-\hat{\pi})}{n}} < 0.01.$

The solution is $n \geq 9604$.
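The sample size in part (b) comes from solving the half-width inequality for $n$ at the worst case $\pi = 0.5$; a two-line R check, added for illustration:

    # Worst case: half-width 1.96 * sqrt(0.25 / n) must not exceed 0.01
    n <- ceiling((1.96 * 0.5 / 0.01)^2)   # 9604
    1.96 * sqrt(0.25 / n)                 # half-width = 0.01 at n = 9604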
9. (a) 442 subjects in each group.

(b) Using the worst case scenario, we need

  $0.03 \geq (1.96)\sqrt{\frac{(0.5)(0.5)}{n} + \frac{(0.5)(0.5)}{n}}.$

The solution is $n \geq 2135$ subjects in each group. Using the researchers' best guess at what the results will be, we need

  $0.03 \geq (1.96)\sqrt{\frac{(0.15)(0.85)}{n} + \frac{(0.22)(0.78)}{n}}.$

The solution is $n \geq 1277$ subjects in each group.

10. (a) Consider a multinomial distribution

  $(Y_{11}, Y_{12}, Y_{22}) \sim \text{Mult}\left(n = 156,\; (\pi_{11}, \pi_{12}, \pi_{22})\right).$

Then, the null hypothesis that the conditional probability of a secondary infection, given a primary infection, is equal to the probability of a primary infection is written as

  $H_0: \frac{\pi_{11}}{\pi_{11} + \pi_{12}} = \pi_{11} + \pi_{12},$

or, writing $\pi = \pi_{11} + \pi_{12}$,

  $H_0: \pi_{11} = \pi^2, \quad \pi_{12} = \pi(1-\pi), \quad \pi_{22} = 1 - \pi.$

(b) Under the assumption in (a), the log-likelihood is given as

  $\ell(\pi; Y) = \log(n!) - \{\log(Y_{11}!) + \log(Y_{12}!) + \log(Y_{22}!)\} + Y_{11}\log(\pi^2) + Y_{12}\log(\pi(1-\pi)) + Y_{22}\log(1-\pi).$

(c) Solving the likelihood equation

  $0 = \frac{\partial \ell}{\partial \pi} = \frac{2Y_{11} + Y_{12}}{\pi} - \frac{Y_{12} + Y_{22}}{1-\pi},$

the mle of $\pi$ is computed as

  $\hat{\pi} = \frac{2Y_{11} + Y_{12}}{2Y_{11} + 2Y_{12} + Y_{22}} = 0.4940.$

Evaluating the negative second derivative using the observed data gives the observed information

  $I(\hat{\pi}) = -\frac{\partial^2 \ell}{\partial \pi^2}\Big|_{\pi = \hat{\pi}} = 996.1423.$

Then, $\widehat{var}(\hat{\pi}) = I(\hat{\pi})^{-1} = 0.0010$. It is better to use the expected information, the negative of the expectation of the second derivative of the log-likelihood, instead of the observed information. This is computed as

  $E\left[\frac{2Y_{11} + Y_{12}}{\pi^2} + \frac{Y_{12} + Y_{22}}{(1-\pi)^2}\right] = \frac{n(1+\pi)}{\pi(1-\pi)}.$

The inverse of this quantity gives a formula for the large sample variance of $\hat{\pi}$: $var(\hat{\pi}) \approx \pi(1-\pi)/\{n(1+\pi)\}$. Evaluating at $\hat{\pi} = 0.494$ and $n = 156$, we have $\widehat{var}(\hat{\pi}) \approx 0.00107$. Then an approximate 95% confidence interval for $\pi$ is

  $\hat{\pi} \pm 1.96\sqrt{\widehat{var}(\hat{\pi})} \;\Rightarrow\; 0.494 \pm (1.96)\sqrt{0.00107} \;\Rightarrow\; (0.430, 0.558).$

(d) The test results are as follows. The model in part (a) is not an appropriate model for the data.

          stat       df   p-value
    G^2   17.73787   1    0.0000+
    X^2   19.70606   1    0.0000+

(e) This question was poorly stated. It should have asked about the probability of primary infection instead of $\pi$. Under the general alternative, the probability of primary infection is estimated as $\hat{\pi}_{11} + \hat{\pi}_{12} = (Y_{11} + Y_{12})/n = 0.596$. From the multinomial distribution, the standard error for this estimate is estimated as

  $s(\hat{\pi}_{11} + \hat{\pi}_{12}) = \sqrt{\widehat{var}(\hat{\pi}_{11} + \hat{\pi}_{12})} = \sqrt{\frac{\hat{\pi}_{11}(1-\hat{\pi}_{11})}{n} + \frac{\hat{\pi}_{12}(1-\hat{\pi}_{12})}{n} - \frac{2\hat{\pi}_{11}\hat{\pi}_{12}}{n}} = 0.0393.$

Alternatively, using the fact that $Y_{11} + Y_{12} \sim \text{Bin}(n, \pi_{11} + \pi_{12})$ gives the same standard error:

  $s(\hat{\pi}_{11} + \hat{\pi}_{12}) = \sqrt{\frac{(\hat{\pi}_{11} + \hat{\pi}_{12})(1 - \hat{\pi}_{11} - \hat{\pi}_{12})}{n}} = 0.0393.$

(f) Using the standard error in (e), a large sample approximate confidence interval is

  $0.596 \pm 1.96\, s(\hat{\pi}_{11} + \hat{\pi}_{12}) = (0.519, 0.673).$

This confidence interval is expected to perform better than the one in (c), with respect to more nearly providing a coverage probability of 0.95, because the model in (a) does not appear to be appropriate for these data.
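The observed counts for problem 10 are not listed above, but $(Y_{11}, Y_{12}, Y_{22}) = (30, 63, 63)$ reproduces every reported statistic ($\hat{\pi} = 0.4940$, $I(\hat{\pi}) = 996.14$, $X^2 = 19.706$, and $\hat{\pi}_{11} + \hat{\pi}_{12} = 0.596$), so the following R sketch, added for illustration, assumes those counts.

    # Problem 10: mle of pi, information, confidence interval, and fit
    y <- c(30, 63, 63)                               # assumed (Y11, Y12, Y22)
    n <- sum(y)                                      # 156
    pihat <- (2 * y[1] + y[2]) / (2 * y[1] + 2 * y[2] + y[3])            # 0.4940
    Iobs <- (2 * y[1] + y[2]) / pihat^2 + (y[2] + y[3]) / (1 - pihat)^2  # 996.14
    vexp <- pihat * (1 - pihat) / (n * (1 + pihat))  # 0.00107, expected information
    pihat + c(-1, 1) * 1.96 * sqrt(vexp)             # (0.430, 0.558)
    m <- n * c(pihat^2, pihat * (1 - pihat), 1 - pihat)  # expected counts under H0
    sum((y - m)^2 / m)                               # X2 = 19.71
    2 * sum(y * log(y / m))                          # G2 = 17.74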