Biostatistics 96-1 HW#11&12, Tzu Chi University, Department of Medical Informatics, 2008/01/09

HW#11 Solution

HW11 source: Principles of Biostatistics, 2nd ed., M. Pagano and K. Gauvreau, Duxbury, 2000
1. Exercise 12.8 (a), (b), (c), (d). (p. 299)
2. Exercise 12.9 (a), (b), (c), (d), (e). (p. 299)

12.8
a. To test $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$, we must first calculate estimates of the within-groups and between-groups variances.

$$s_W^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + (n_3 - 1)s_3^2 + (n_4 - 1)s_4^2}{n - 4} = 1.75$$

Since the grand mean is $\bar{x} = (n_1\bar{x}_1 + n_2\bar{x}_2 + n_3\bar{x}_3 + n_4\bar{x}_4)/n = 5.58$, the between-groups variance is

$$s_B^2 = \frac{n_1(\bar{x}_1 - \bar{x})^2 + n_2(\bar{x}_2 - \bar{x})^2 + n_3(\bar{x}_3 - \bar{x})^2 + n_4(\bar{x}_4 - \bar{x})^2}{4 - 1} = 19.06.$$

The test statistic is $F = s_B^2 / s_W^2 = 19.06/1.75 = 10.89$, where $F \sim F_{df_1, df_2} = F_{3, 1494}$ under $H_0$. Since the p-value is less than 0.001, we reject the null hypothesis.

b. We conclude that mean LDL cholesterol level is not the same for all four groups.

c. In order to use the one-way ANOVA, the four populations must be at least approximately normally distributed, and their variances must all be the same.

d. It is necessary to take an additional step in this analysis. We have concluded that the mean LDL levels are not the same for all four groups, but we cannot yet say which group means differ from which others. To do this, we use the Bonferroni method of multiple comparisons. (It would tell us that patients with intermittent claudication and those with minor asymptomatic disease have higher mean LDL cholesterol levels than patients with no disease.)

12.9
a. The average minutes of individual therapy per session is longest for the private not-for-profit centers and shortest for the for-profit centers; the average minutes of group therapy is longest for the for-profit centers and shortest for the public centers.

b. For individual therapy, the three confidence intervals all overlap. Therefore, there is nothing to suggest that the population means are not identical. (This is not a formal test, however.) The same is true for group therapy.

c.
To test $H_0: \mu_1 = \mu_2 = \mu_3$ for individual therapy across the three types of center, we first calculate estimates of the within-groups and between-groups variances.

$$s_W^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + (n_3 - 1)s_3^2}{n - 3} = 135.4$$

Since the grand mean is $\bar{x} = (n_1\bar{x}_1 + n_2\bar{x}_2 + n_3\bar{x}_3)/n = 53.88$, the between-groups variance is

$$s_B^2 = \frac{n_1(\bar{x}_1 - \bar{x})^2 + n_2(\bar{x}_2 - \bar{x})^2 + n_3(\bar{x}_3 - \bar{x})^2}{3 - 1} = 515.8.$$

The test statistic is $F = s_B^2 / s_W^2 = 515.8/135.4 = 3.81$, where $F \sim F_{df_1, df_2} = F_{2, 515}$ under $H_0$. Since 0.01 < p < 0.025, we reject $H_0$ at the 0.05 level and conclude that the mean minutes of individual therapy per session are not the same for all three types of centers.

To apply the Bonferroni method of multiple comparisons, we conduct three pairwise tests at the 0.05/3 = 0.0167 level of significance. The test statistics are:

$$t_{12} = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_W^2\,[(1/n_1) + (1/n_2)]}} = \frac{(49.46 - 54.76) - 0}{\sqrt{135.4\,[(1/37) + (1/312)]}} = -2.62$$

$$t_{13} = \frac{(49.46 - 53.25) - 0}{\sqrt{135.4\,[(1/37) + (1/169)]}} = -1.79$$

$$t_{23} = \frac{(54.76 - 53.25) - 0}{\sqrt{135.4\,[(1/312) + (1/169)]}} = 1.36$$

Under the null hypothesis, each test statistic has a t distribution with 515 df. The comparison of private for-profit and not-for-profit centers results in p = 0.008; the p-values for the other two comparisons are both greater than 0.0167. Therefore, we conclude that the mean minutes of individual therapy per session is lower for for-profit centers than for not-for-profit centers.

d. To test $H_0: \mu_1 = \mu_2 = \mu_3$ for group therapy across the three types of center, we again calculate estimates of the within-groups and between-groups variances.

$$s_W^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + (n_3 - 1)s_3^2}{n - 3} = 947.7$$

Since the grand mean is $\bar{x} = (n_1\bar{x}_1 + n_2\bar{x}_2 + n_3\bar{x}_3)/n = 97.60$, the between-groups variance is

$$s_B^2 = \frac{n_1(\bar{x}_1 - \bar{x})^2 + n_2(\bar{x}_2 - \bar{x})^2 + n_3(\bar{x}_3 - \bar{x})^2}{3 - 1} = 2159.2.$$

The test statistic is $F = s_B^2 / s_W^2 = 2159.2/947.7 = 2.28$, where $F \sim F_{2, 488}$ under $H_0$. Since p > 0.10, we cannot reject $H_0$ at the 0.05 level; there is no evidence that the mean minutes of group therapy per session differ among the three types of center.

e.
While private for-profit centers have shorter individual therapy sessions than not-for-profit centers, on average, there are no significant differences in the length of group therapy sessions.

HW#12

HW12 source: Principles of Biostatistics, 2nd ed., M. Pagano and K. Gauvreau, Duxbury, 2000
1. Exercise 17.5 (a), (b), (c), (d), (e), (f), (g). (p. 412)
2. Exercise 18.9 (b), (c), (d), (e). (p. 444)
3. Exercise 18.11 (a), (b), (c), (d). (p. 445, B-10)

17.5
(a) [Figure: scatter plot of cholesterol level (vertical axis) against triglyceride level (horizontal axis) with fitted line; R Sq Linear = 0.422]

(b) There appears to be a slight tendency for cholesterol to increase as triglyceride increases.

(c) Pearson's coefficient of correlation is

$$r = \frac{1}{n - 1}\sum_{i=1}^{n} \frac{(x_i - \bar{x})(y_i - \bar{y})}{s_x s_y} = \frac{34.90}{(10 - 1)(1.563)(3.818)} = 0.650.$$

(d) The test statistic is

$$T = r\sqrt{\frac{n - 2}{1 - r^2}} = 0.650\sqrt{\frac{10 - 2}{1 - 0.650^2}} = 2.42.$$

Under $H_0: \rho = 0$, $T \sim t_8$ if X and Y are normally distributed. Since p = 0.042 < 0.05, we reject $H_0$ and conclude that $\rho$ is not equal to 0.

(e) The Spearman rank correlation coefficient is

$$r_s = 1 - \frac{6\sum_{i=1}^{n} d_i^2}{n(n^2 - 1)} = 1 - \frac{6(96)}{10(100 - 1)} = 0.418.$$

(f) $r_s$ is smaller in magnitude than $r$. However, it still suggests a moderate positive relationship between cholesterol and triglyceride levels.

(g) The test statistic (valid for n ≥ 10) is

$$T_s = r_s\sqrt{\frac{n - 2}{1 - r_s^2}} = 0.418\sqrt{\frac{10 - 2}{1 - 0.418^2}} = 1.30.$$

Under $H_0$, $T_s \sim t_8$. Since p = 0.229 > 0.05, we cannot reject $H_0$.

Descriptive Statistics

                      Mean     Std. Deviation   N
Cholesterol Level     6.7330   1.56331          10
Triglyceride Level    5.9090   3.81842          10

Correlations (Spearman's rho)

                                                Cholesterol Level   Triglyceride Level
Cholesterol Level     Correlation Coefficient   1.000               .418
                      Sig. (2-tailed)           .                   .229
                      N                         10                  10
Triglyceride Level    Correlation Coefficient   .418                1.000
                      Sig. (2-tailed)           .229                .
                      N                         10                  10

Correlations (Pearson)

                                                            Cholesterol Level   Triglyceride Level
Cholesterol Level     Pearson Correlation                   1                   .650*
                      Sig. (2-tailed)                                           .042
                      Sum of Squares and Cross-products     21.995              34.902
                      Covariance                            2.444               3.878
                      N                                     10                  10
Triglyceride Level    Pearson Correlation                   .650*               1
                      Sig. (2-tailed)                       .042
                      Sum of Squares and Cross-products     34.902              131.223
                      Covariance                            3.878               14.580
                      N                                     10                  10

*. Correlation is significant at the 0.05 level (2-tailed).

18.9
(b) $(\hat{\alpha}, \hat{\beta}) = (10.552, 1.264)$, which gives the fitted line $\hat{y} = \hat{\alpha} + \hat{\beta}x = 10.552 + 1.264x$. The slope implies that each one-week increase in gestational age corresponds to an increase of 1.264 mm Hg in an infant's systolic blood pressure, on average. Although it does not make sense in this case, the y-intercept of 10.552 is the predicted systolic blood pressure for an infant whose gestational age is 0.

(c) Test $H_0: \beta = 0$ vs $H_1: \beta \neq 0$ (a test for no linear relationship). The test statistic is

$$t = \frac{\hat{\beta}}{\widehat{se}(\hat{\beta})} = \frac{1.264}{0.436} = 2.899,$$

where $t \sim t_{98}$ under $H_0$, and p = 0.005 < 0.05. We reject $H_0$ at the 0.05 level and conclude that for low birth weight infants, systolic blood pressure increases in magnitude as gestational age increases.

(d) For x = 31, $\hat{y} = 10.552 + 1.264x = 10.552 + 1.264(31) = 49.736$.

(e) The estimated standard error of $\hat{y}$ is

$$\widehat{se}(\hat{y}) = s_{y|x}\sqrt{\frac{1}{n} + \frac{(x - \bar{x})^2}{\sum_i (x_i - \bar{x})^2}} = 11.0\sqrt{\frac{1}{100} + \frac{4.4521}{635.694}} = 1.434,$$

where x = 31, $\bar{x} = 28.89$, $(x - \bar{x})^2 = 4.4521$, and $\sum_i (x_i - \bar{x})^2 = 635.694$.

A 95% confidence interval for the mean systolic blood pressure at x = 31 is

$$\left(\hat{y} - t_{98,\,.025}\,\widehat{se}(\hat{y}),\ \hat{y} + t_{98,\,.025}\,\widehat{se}(\hat{y})\right) = 49.736 \pm 1.98(1.434) = (46.91, 52.59).$$

Model Summary (b)

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .281(a)   .079       .070                11.000

a. Predictors: (Constant), gestational age
b. Dependent Variable: systolic blood pressure

Coefficients (a)

                    Unstandardized Coefficients   Standardized Coefficients                   95% Confidence Interval for B
Model 1             B         Std. Error          Beta                        t       Sig.    Lower Bound   Upper Bound
(Constant)          10.552    12.651                                          .834    .406    -14.553       35.657
gestational age     1.264     .436                .281                        2.898   .005    .399          2.130

a. Dependent Variable: systolic blood pressure

18.11
(a) The slope implies that for each 1% increase in contraceptive practice, total fertility rate decreases by 0.062 births per woman. The intercept of 6.83 is the predicted total fertility rate for a country with 0% prevalence of contraceptive practice.

(b) The two-way scatter plot is displayed below. It appears that total fertility rate decreases as contraceptive practice increases.

[Figure: scatter plot of fertility rate per woman (vertical axis) against contraception use (%) (horizontal axis) with fitted line; R Sq Linear = 0.22]

(c) $\hat{y} = \hat{\alpha} + \hat{\beta}x = 6.947 - 0.030x$

Coefficients (a)

                          Unstandardized Coefficients   Standardized Coefficients
Model 1                   B         Std. Error          Beta                        t         Sig.
(Constant)                6.947     .261                                            26.628    .000
contraception use (%)     -.030     .014                -.469                       -2.056    .058

a. Dependent Variable: fertility rate per woman

(d) The total fertility rates for the African nations are higher than would be predicted based on the regression line.
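
The hand calculations in these solutions (the ANOVA F ratios and Bonferroni t statistics in 12.8 and 12.9, the correlation tests in 17.5, and the regression quantities in 18.9) can be re-checked numerically. The following is a minimal Python sketch that uses only the standard library and the rounded summary values quoted in the solutions. The function names are hypothetical helpers introduced here for illustration; they are not taken from the textbook or from SPSS, and p-values are not computed.

```python
import math


def f_statistic(s_between, s_within):
    """One-way ANOVA F ratio: between-groups over within-groups variance."""
    return s_between / s_within


def bonferroni_t(mean_a, mean_b, n_a, n_b, s_within):
    """Pairwise t statistic based on the pooled within-groups variance."""
    return (mean_a - mean_b) / math.sqrt(s_within * (1 / n_a + 1 / n_b))


def correlation_t(r, n):
    """t statistic for H0: rho = 0, referred to a t distribution with n - 2 df."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


def se_mean_response(s_yx, n, x, x_bar, sxx):
    """Standard error of the fitted mean response at x in simple regression."""
    return s_yx * math.sqrt(1 / n + (x - x_bar) ** 2 / sxx)


# 12.8 and 12.9: F ratios from the quoted variance estimates
print(round(f_statistic(19.06, 1.75), 2))      # solutions report 10.89
print(round(f_statistic(515.8, 135.4), 2))     # solutions report 3.81
print(round(f_statistic(2159.2, 947.7), 2))    # solutions report 2.28

# 12.9 (c): Bonferroni pairwise comparisons for individual therapy
print(round(bonferroni_t(49.46, 54.76, 37, 312, 135.4), 2))   # -2.62
print(round(bonferroni_t(49.46, 53.25, 37, 169, 135.4), 2))   # -1.79
print(round(bonferroni_t(54.76, 53.25, 312, 169, 135.4), 2))  # 1.36

# 17.5 (d), (g): tests of zero correlation with n = 10
print(round(correlation_t(0.650, 10), 2))      # 2.42
print(round(correlation_t(0.418, 10), 2))      # 1.30

# 18.9 (c), (e): slope t statistic and standard error of y-hat at x = 31
print(round(1.264 / 0.436, 3))                                    # 2.899
print(round(se_mean_response(11.0, 100, 31, 28.89, 635.694), 3))  # 1.434
```

Because the inputs are already rounded to the precision shown in the solutions, small last-digit differences from software output (for example, SPSS reports t = 2.898 for the slope) are to be expected.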