Statistics 305
CHAPTER 7 – SECTIONS 1, 2, AND 3

Formulae for inferences about population means, and relationships among population means, in multiple sample studies.

Assume $r > 2$ populations and samples of sizes $n_1, n_2, \ldots, n_r$ from $N(\mu_i, \sigma^2)$ populations $(i = 1, 2, \ldots, r)$, with total sample size $n = n_1 + n_2 + \cdots + n_r$. The sample means are $\bar{y}_1, \bar{y}_2, \ldots, \bar{y}_r$ and the pooled estimate of $\sigma^2$ is $s_p^2$.

I. SINGLE SAMPLE INFERENCES − Confidence Intervals for $\mu_i$.

A. Ignoring the problem with multiple comparisons

$$\bar{y}_i \pm t_{n-r}^{(1-\alpha/2)} \, \frac{s_p}{\sqrt{n_i}} \qquad (i = 1, 2, \ldots, r)$$

are $100(1-\alpha)\%$ confidence intervals for $\mu_i$.

B. Simultaneous confidence intervals

$$\bar{y}_i \pm k_2^* \, \frac{s_p}{\sqrt{n_i}} \qquad (i = 1, 2, \ldots, r)$$

have overall simultaneous confidence $100(1-\alpha)\%$. ($k_2^*$ is read from Table B.8A. JMP doesn't support this method. It is called the Pillai-Ramachandran Method.)

II. TWO SAMPLE INFERENCES − Confidence Intervals for $\mu_i - \mu_j$.

A. Ignoring the problem of multiple comparisons

$$\bar{y}_i - \bar{y}_j \pm t_{n-r}^{(1-\alpha/2)} \, s_p \sqrt{\frac{1}{n_i} + \frac{1}{n_j}} \qquad (i, j = 1, 2, \ldots, r)$$

(JMP output − "comparison of each pair using Student's t").

B. Simultaneous confidence intervals

$$\bar{y}_i - \bar{y}_j \pm \frac{q^*}{\sqrt{2}} \, s_p \sqrt{\frac{1}{n_i} + \frac{1}{n_j}} \qquad (i, j = 1, 2, \ldots, r)$$

($q^*$ in Table B.9) (df = $n - r$) (JMP output − "comparison for all pairs using Tukey-Kramer HSD").

III. CONFIDENCE INTERVAL AND SIGNIFICANCE TEST FOR A LINEAR COMBINATION OF POPULATION MEANS.

For user selected numbers $c_1, c_2, \ldots, c_r$ the interest is in

$$L = c_1 \mu_1 + c_2 \mu_2 + \cdots + c_r \mu_r .$$

The point estimate is

$$\hat{L} = c_1 \bar{y}_1 + c_2 \bar{y}_2 + \cdots + c_r \bar{y}_r .$$

A. Confidence interval for $L$.

$$\hat{L} \pm t_{n-r}^{(1-\alpha/2)} \, s_p \sqrt{\frac{c_1^2}{n_1} + \frac{c_2^2}{n_2} + \cdots + \frac{c_r^2}{n_r}} .$$

B. Significance test of $H_0 : L = \#$. The test statistic is

$$T = \frac{\hat{L} - \#}{s_p \sqrt{\dfrac{c_1^2}{n_1} + \dfrac{c_2^2}{n_2} + \cdots + \dfrac{c_r^2}{n_r}}} .$$

This is taken as a Student's t random variable having $n - r$ degrees of freedom. (JMP doesn't compute $\hat{L}$'s directly.)

NOTE: If all $c_i$ except $c_j$ are zero and $c_j = 1$, then the formula reduces to the single sample formula I.
Similarly, if two of the $c$'s are $-1$ and $1$, and all others zero, then the formula for the confidence interval is II.

If several confidence intervals $100(1-\alpha_1)\%, 100(1-\alpha_2)\%, \ldots, 100(1-\alpha_k)\%$ for $L_1, L_2, \ldots, L_k$ are made, the Bonferroni inequality gives a bound for the overall (simultaneous) confidence $100\gamma\%$, where

$$\gamma \ge 1 - \sum_{i=1}^{k} \alpha_i .$$

IV. SIGNIFICANCE TESTS OF THE FORM $H_0 : \mu_i - \mu_j = 0$ VS $H_a : \mu_i - \mu_j \ne 0$ USING JMP OUTPUT.

A. Ignoring the problem of multiple comparisons, the output labeled "comparison for each pair using Student's t" gives relevant information. The matrix of values "ABS(DIF)−LSD" contains

$$|\bar{y}_i - \bar{y}_j| - t_{n-r}^{(1-\alpha/2)} \, s_p \sqrt{\frac{1}{n_i} + \frac{1}{n_j}} .$$

The LSD is defined as

$$\mathrm{LSD} = t_{n-r}^{(1-\alpha/2)} \, s_p \sqrt{\frac{1}{n_i} + \frac{1}{n_j}} .$$

If $|\bar{y}_i - \bar{y}_j| - \mathrm{LSD} > 0$ then the p-value of the significance test is less than $\alpha$. Otherwise it is $\ge \alpha$ (and the $100(1-\alpha)\%$ confidence interval contains zero).

Understanding JMP Student's t Comparison of Pairs of Means Output Matrix.

To perform the significance test

$$H_0 : \mu_i - \mu_j = 0 \qquad H_a : \mu_i - \mu_j \ne 0$$

we compute the value

$$T_{n-r} = \frac{\bar{y}_i - \bar{y}_j}{s_p \sqrt{\dfrac{1}{n_i} + \dfrac{1}{n_j}}} \qquad (n - r \ \text{DF}).$$

Then the p-value of the test is $2\left(1 - \mathrm{CDF}(|T_{n-r}|)\right)$, for CDF the cumulative distribution function of the Student's t distribution having $n - r$ DF. Let $t_{n-r}^{(1-\alpha/2)}$ be such that $\mathrm{CDF}\left(t_{n-r}^{(1-\alpha/2)}\right) = 1 - \alpha/2$. If $|T_{n-r}| > t_{n-r}^{(1-\alpha/2)}$ then

$$\frac{|\bar{y}_i - \bar{y}_j|}{s_p \sqrt{\dfrac{1}{n_i} + \dfrac{1}{n_j}}} > t_{n-r}^{(1-\alpha/2)} \qquad \text{or} \qquad |\bar{y}_i - \bar{y}_j| > s_p \, t_{n-r}^{(1-\alpha/2)} \sqrt{\frac{1}{n_i} + \frac{1}{n_j}} .$$

Define

$$\mathrm{LSD} = s_p \, t_{n-r}^{(1-\alpha/2)} \sqrt{\frac{1}{n_i} + \frac{1}{n_j}} ;$$

then if $|\bar{y}_i - \bar{y}_j| > \mathrm{LSD}$, hence $|\bar{y}_i - \bar{y}_j| - \mathrm{LSD} > 0$, the p-value of the test is less than $\alpha$; otherwise it is $\ge \alpha$.

B. Simultaneous significance tests $H_0 : \mu_i - \mu_j = 0$ vs. $H_a : \mu_i - \mu_j \ne 0$. The output labeled "comparison of all pairs using Tukey-Kramer HSD" is relevant. The matrix of values "ABS(DIF)−LSD" contains

$$|\bar{y}_i - \bar{y}_j| - q^* \, s_p \sqrt{\frac{1}{n_i} + \frac{1}{n_j}}$$

($q^*$ here is your textbook's tabled value $\div \sqrt{2}$), with

$$\mathrm{LSD} = q^* \, s_p \sqrt{\frac{1}{n_i} + \frac{1}{n_j}} .$$

If $|\bar{y}_i - \bar{y}_j| - \mathrm{LSD} > 0$, then $\mu_i - \mu_j \ne 0$ is indicated at level $\alpha$; the probability of rejecting one or more true $H_0$'s is $\le \alpha$.
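The "ABS(DIF)−LSD" entries described in IV can be reproduced by hand, which may help when checking the output. The Python sketch below is only an illustration, not JMP's implementation. The function name `abs_dif_minus_lsd`, the three sample means, the sample sizes, and the pooled standard deviation are all made up for the example. The critical value 2.262 is the 0.975 quantile of Student's t with $n - r = 12 - 3 = 9$ degrees of freedom, as would be read from a t table.

```python
from math import sqrt


def abs_dif_minus_lsd(means, sizes, s_p, crit):
    """Build the matrix with entries |ybar_i - ybar_j| - crit * s_p * sqrt(1/n_i + 1/n_j).

    `crit` is the multiplier supplied by the user: a Student's t quantile
    for the unadjusted comparisons, or a table-based multiplier for a
    simultaneous method. (Hypothetical helper, not part of JMP.)
    """
    r = len(means)
    matrix = [[0.0] * r for _ in range(r)]
    for i in range(r):
        for j in range(r):
            lsd = crit * s_p * sqrt(1.0 / sizes[i] + 1.0 / sizes[j])
            matrix[i][j] = abs(means[i] - means[j]) - lsd
    return matrix


# Made-up illustrative data: r = 3 samples of size 4, so n - r = 9.
means = [10.0, 12.0, 15.0]
sizes = [4, 4, 4]
s_p = 2.0
t_crit = 2.262  # tabled 0.975 quantile of Student's t with 9 df

matrix = abs_dif_minus_lsd(means, sizes, s_p, t_crit)

# Pairs (0-based indices, i < j) whose entry is positive, i.e. whose
# test of mu_i - mu_j = 0 has p-value below alpha = 0.05.
significant = [
    (i, j)
    for i in range(len(means))
    for j in range(i + 1, len(means))
    if matrix[i][j] > 0
]

for row in matrix:
    print([round(value, 3) for value in row])
print("significant pairs:", significant)
```

Passing the Tukey-Kramer multiplier (the tabled $q^*$ divided by $\sqrt{2}$) as `crit` in place of the t quantile would give the simultaneous version of the matrix from IV.B under the same assumptions.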