Stat 402B (Spring 2016): Note set 3
Last update: January 20, 2016

Comparing Several Treatments

Data. Let $y_{ij}$ represent the $j$-th observation taken under treatment $i$, for $i = 1, \ldots, a$ and $j = 1, \ldots, n_i$. Let $N = \sum_{i=1}^{a} n_i$.

                 Treatment 1    Treatment 2    ...    Treatment a
                 $y_{11}$       $y_{21}$       ...    $y_{a1}$
                 $y_{12}$       $y_{22}$       ...    $y_{a2}$
                 ...            ...                   ...
                 $y_{1n_1}$     $y_{2n_2}$     ...    $y_{an_a}$
Sample means:    $\bar{y}_{1.}$ $\bar{y}_{2.}$ ...    $\bar{y}_{a.}$

where $\bar{y}_{i.} = \frac{1}{n_i}\sum_{j=1}^{n_i} y_{ij}$, $i = 1, \ldots, a$.

Assume

$$y_{11}, \ldots, y_{1n_1} \overset{iid}{\sim} N(\mu_1, \sigma^2)$$
$$y_{21}, \ldots, y_{2n_2} \overset{iid}{\sim} N(\mu_2, \sigma^2)$$
$$\vdots$$
$$y_{a1}, \ldots, y_{an_a} \overset{iid}{\sim} N(\mu_a, \sigma^2)$$

This assumption can be represented by the model (the means model):

$$y_{ij} = \mu_i + e_{ij}, \quad \text{where } e_{ij} \overset{iid}{\sim} N(0, \sigma^2),$$

or equivalently, by the effects model:

$$y_{ij} = \mu + \tau_i + e_{ij},$$

where the $\tau_i$ are treatment effects expressed as deviations from a fixed value $\mu$. The null hypothesis of interest is

$$H_0: \mu_1 = \cdots = \mu_a \quad \text{vs} \quad H_a: \text{at least one inequality}$$

or, equivalently,

$$H_0: \tau_1 = \cdots = \tau_a \quad \text{vs} \quad H_a: \text{at least one inequality}.$$

Variation within treatments

$$s_1^2 = \frac{\sum_{j=1}^{n_1} (y_{1j} - \bar{y}_{1.})^2}{n_1 - 1}, \quad \text{estimates } \sigma^2$$
$$s_2^2 = \frac{\sum_{j=1}^{n_2} (y_{2j} - \bar{y}_{2.})^2}{n_2 - 1}, \quad \text{estimates } \sigma^2$$
$$\vdots$$
$$s_a^2 = \frac{\sum_{j=1}^{n_a} (y_{aj} - \bar{y}_{a.})^2}{n_a - 1}, \quad \text{estimates } \sigma^2$$

Pool them to obtain one estimator of $\sigma^2$:

$$s_E^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + \cdots + (n_a - 1)s_a^2}{(n_1 - 1) + (n_2 - 1) + \cdots + (n_a - 1)} = \frac{\sum_{i=1}^{a}\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{i.})^2}{N - a}$$

• $\sum_{i=1}^{a}\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{i.})^2$ is called the within treatment sum of squares with $(N - a)$ d.f. and is denoted by $SS_E$.
• $s_E^2$ is called the within treatment mean square.

Variation Between Treatments

From the above, we know that $\bar{y}_{i.} \sim N(\mu_i, \sigma^2 / n_i)$, $i = 1, 2, \ldots, a$, and that they are independent.

$$s_{Trt}^2 = \frac{\sum_{i=1}^{a} n_i (\bar{y}_{i.} - \bar{y}_{..})^2}{a - 1}, \quad \text{where } \bar{y}_{..} = \frac{\sum_{i=1}^{a}\sum_{j=1}^{n_i} y_{ij}}{N}.$$

• The numerator $\sum_{i=1}^{a} n_i (\bar{y}_{i.} - \bar{y}_{..})^2$ is the between treatment sum of squares with $(a - 1)$ d.f. and is denoted by $SS_{Trt}$.
• $s_{Trt}^2$ is the between treatment mean square.

$$s_{Tot}^2 = \frac{\sum_{i=1}^{a}\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{..})^2}{N - 1}$$

• $\sum_{i=1}^{a}\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{..})^2$ is called the total corrected sum of squares with $(N - 1)$ degrees of freedom and is denoted by $SS_{Tot}$.
• We have the partitioning $SS_{Tot} = SS_{Trt} + SS_E$ that gives the ANOVA table:

Source of Variation    d.f.     SS           MS           F
Bet. Trt               a - 1    $SS_{Trt}$   $MS_{Trt}$   $MS_{Trt}/MS_E$
Within Trt             N - a    $SS_E$       $MS_E$
Total                  N - 1    $SS_{Tot}$

We reject $H_0$ at the $\alpha$ level of significance if $F > F_{\alpha, a-1, N-a}$.

Example 3.1: Plasma Etching Experiment

An engineer is interested in investigating the relationship between the RF power setting and the etch rate for a wafer plasma-etching tool. She is interested in a particular gas (hexafluoroethane) and gap (0.8 cm) and wants to test four levels of RF power: 160, 180, 200, and 220 W. She decided to test five wafers at each level of RF power. This is an example of a single-factor experiment with $a = 4$ levels and $n = 5$ replicates.
For this to be a completely randomized design, the 20 tests need to be run in a random order, i.e., the order in which the testing is done has to be determined randomly.

ANOVA Table (Table 3.4)

Source of Variation    d.f.    SS           MS           F       p-value
RF Power               3       66,870.55    22,290.18    66.8    < .0001
Error                  16      5,339.20     333.70
Total                  19      72,209.75

Estimation and Prediction

$$y_{ij} = \mu_i + e_{ij} \quad \text{or} \quad y_{ij} = \mu + \tau_i + e_{ij}, \quad \text{where } e_{ij} \sim N(0, \sigma^2).$$

• Best estimates of $\mu_1, \ldots, \mu_a$ are $\bar{y}_{1.}, \bar{y}_{2.}, \ldots, \bar{y}_{a.}$ respectively, i.e., $\hat{\mu}_i = \bar{y}_{i.}$, $i = 1, \ldots, a$.
• The estimate of the difference $\mu_p - \mu_q$ for any $p \neq q$ is $\bar{y}_{p.} - \bar{y}_{q.}$. This difference is the basis for the pairwise test of $H_0: \mu_p = \mu_q$ vs $H_a: \mu_p \neq \mu_q$.
• We cannot estimate the $\tau_i$'s uniquely, but we can estimate the difference in a pair of $\tau_i$'s. The estimate of $\tau_p - \tau_q$ is $\bar{y}_{p.} - \bar{y}_{q.}$.
• $\hat{\sigma}^2 = s_E^2 = MS_E$
• Values predicted by the model for $y_{ij}$ are $\hat{y}_{ij} = \hat{\mu}_i = \bar{y}_{i.}$ for each observation.
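As a numerical check on the partition $SS_{Tot} = SS_{Trt} + SS_E$ and on the F ratio, the following minimal Python sketch recomputes the ANOVA quantities for the plasma etching experiment from the 20 etch-rate observations tabulated in these notes. The helper name `one_way_anova` and its dictionary keys are illustrative choices, not something the notes define.

```python
from statistics import mean, variance

# Etch-rate observations from the plasma etching example, keyed by RF power (W).
etch = {
    160: [575, 542, 530, 539, 570],
    180: [565, 593, 590, 579, 610],
    200: [600, 651, 610, 637, 629],
    220: [725, 700, 715, 685, 710],
}


def one_way_anova(groups):
    """Sums of squares, mean squares and F for a one-way layout (illustrative helper)."""
    obs = [y for g in groups for y in g]
    N, a = len(obs), len(groups)
    grand = mean(obs)                        # grand mean, ybar..
    group_means = [mean(g) for g in groups]  # treatment means, ybar_i.
    ss_trt = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, group_means))
    ss_e = sum((y - m) ** 2 for g, m in zip(groups, group_means) for y in g)
    ss_tot = sum((y - grand) ** 2 for y in obs)
    ms_trt, ms_e = ss_trt / (a - 1), ss_e / (N - a)
    return {
        "df": (a - 1, N - a, N - 1),
        "SS_Trt": ss_trt, "SS_E": ss_e, "SS_Tot": ss_tot,
        "MS_Trt": ms_trt, "MS_E": ms_e, "F": ms_trt / ms_e,
    }


groups = list(etch.values())
table = one_way_anova(groups)

# Pooled variance: weighted average of the within-treatment sample variances.
pooled = sum((len(g) - 1) * variance(g) for g in groups) / sum(len(g) - 1 for g in groups)

print(f"SS_Trt = {table['SS_Trt']:.2f}, SS_E = {table['SS_E']:.2f}, SS_Tot = {table['SS_Tot']:.2f}")
print(f"MS_Trt = {table['MS_Trt']:.2f}, MS_E = {table['MS_E']:.2f}, F = {table['F']:.1f}")
print(f"pooled s_E^2 = {pooled:.2f}")
```

The pooled variance computed from the four sample variances agrees with $MS_E$, which is the identity stated in the within-treatment section; the p-value is not computed here because it needs an F distribution routine outside the standard library.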
Residuals

• Residuals: $r_{ij} = y_{ij} - \hat{y}_{ij} = y_{ij} - \bar{y}_{i.}$
• $SS_E$ is also called the residual sum of squares.
• Residuals are entirely model dependent. If the model is adequate, the residuals should contain no obvious patterns.

Data $y_{ij}$ (etch rate)

Power (W)        160      180      200      220
                 575      565      600      725
                 542      593      651      700
                 530      590      610      715
                 539      579      637      685
                 570      610      629      710
$y_{i.}$         2756     2937     3127     3535
$n_i$            5        5        5        5
$\bar{y}_{i.}$   551.2    587.4    625.4    707.0

Predicted values $\hat{y}_{ij}$

Power (W)        160       180       200       220
                 551.20    587.40    625.40    707.00
                 551.20    587.40    625.40    707.00
                 551.20    587.40    625.40    707.00
                 551.20    587.40    625.40    707.00
                 551.20    587.40    625.40    707.00

Residuals $y_{ij} - \hat{y}_{ij}$

Power (W)        160      180      200      220
                 23.8     -22.4    -25.4    18.0
                 -9.2     5.6      25.6     -7.0
                 -21.2    2.6      -15.4    8.0
                 -12.2    -8.4     11.6     -22.0
                 18.8     22.6     3.6      3.0

Pairwise Comparison of Means (or Effects)

Pairwise t-tests

To test $H_0: \mu_p = \mu_q$ vs $H_a: \mu_p \neq \mu_q$, compute

$$t_0 = \frac{\bar{y}_{p.} - \bar{y}_{q.}}{s_E \sqrt{\frac{1}{n_p} + \frac{1}{n_q}}}$$

Reject $H_0$ if $|t_0| > t_{\alpha/2, N-a}$, i.e., declare $\mu_p$ and $\mu_q$ different at the $\alpha$ level of significance. Here $N = \sum_{i=1}^{a} n_i$, $s_E^2 = MS_E$, and $t_{\alpha/2, N-a}$ is the upper $100(\alpha/2)\%$ point of the t-distribution with $(N - a)$ d.f.

Confidence Intervals (CIs) for Differences

A $100(1 - \alpha)\%$ CI for $\mu_p - \mu_q$ is given by

$$(\bar{y}_{p.} - \bar{y}_{q.}) \pm t_{\alpha/2, N-a} \cdot s_E \sqrt{\frac{1}{n_p} + \frac{1}{n_q}}$$

If this interval does not include zero, we say $\mu_p$ and $\mu_q$ are different at the $\alpha$ level of significance.

Least Significant Difference (LSD)

Declare $\mu_p$, $\mu_q$ significantly different at the $\alpha$ level if

$$|\bar{y}_{p.} - \bar{y}_{q.}| > t_{\alpha/2, N-a} \cdot s_E \sqrt{\frac{1}{n_p} + \frac{1}{n_q}}$$

The quantity on the right-hand side is called the least significant difference, or the LSD. For the balanced case, i.e., $n_1 = \cdots = n_a = n$,

$$LSD_\alpha = t_{\alpha/2, N-a} \cdot s_E \sqrt{\frac{2}{n}}$$

LSD Procedure (a single LSD value can be used this way only when the sample sizes are equal).

Important Notes:

• The type I error rate $\alpha$ holds only for testing a single hypothesis, not for testing all possible differences among the means.
• When we do all pairwise tests at the $\alpha$ level, the actual type I error rate turns out to be much greater than $\alpha$.
• One way to alleviate this problem is to use the Bonferroni adjustment: replace $\alpha$ with $\alpha / k$, where $k$ is the number of comparisons being made. When $a$ means are being compared, $k = a(a - 1)/2$. So for computing the LSD, one would use the t-value $t_{\alpha/(2k), N-a}$.
• The problem with this method is that it turns out to be too conservative, i.e., one may fail to declare significant a few pairs of means that are actually different.
• LSD procedures often lead to conflicting results.
• In any case, make sure that the ANOVA F-statistic is significant at the $\alpha$ level before using the LSD procedure, to ensure that the type I error rate is somewhat controlled when making all pairwise comparisons.

Tukey's Method

When comparing $a$ means, suppose we wish to state a confidence interval for $\mu_p - \mu_q$ taking into account the fact that all possible pairwise differences may be examined. A $100(1 - \alpha)\%$ CI for $\mu_p - \mu_q$ is given by

$$(\bar{y}_{p.} - \bar{y}_{q.}) \pm \frac{q_{\alpha, a, f}}{\sqrt{2}} \cdot s_E \sqrt{\frac{1}{n_p} + \frac{1}{n_q}}$$

where $q_{\alpha, a, f}$ is the upper $\alpha$ percentage point of the studentized range (see Table VII), $a$ is the number of means compared, and $f$ is the degrees of freedom associated with $s_E^2 = MS_E$.

For the balanced case,

$$Tukey_\alpha = q_{\alpha, a, f} \cdot \frac{s_E}{\sqrt{n}}$$

Calculate and use this value the same way as the LSD, i.e., declare a difference of means $\bar{y}_{p.} - \bar{y}_{q.}$ significant if $|\bar{y}_{p.} - \bar{y}_{q.}| > Tukey_\alpha$. If this procedure is used when making all possible pairwise tests of differences, the total type I error rate is controlled at $\alpha$.

Example 3.7: As an illustration of this procedure, compute the Tukey value at $\alpha = .05$ for the plasma etching example. For the etch-rate example,

$$Tukey_{.05} = q_{.05, 4, 16} \cdot \frac{s_E}{\sqrt{5}} = 4.05 \sqrt{\frac{333.7}{5}} = 33.09$$

Using this value, we still find the differences of all pairs of means to be significantly different.
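The pairwise procedures for the etch-rate example can be put side by side in a short Python sketch. It takes $MS_E = 333.70$, the four treatment means, and the tabled critical values $t_{.025,16} = 2.12$ and $q_{.05,4,16} = 4.05$ as quoted in these notes rather than computing them. The function names `compare` and `diff_ci` are hypothetical helpers introduced only for illustration.

```python
from itertools import combinations
from math import sqrt

# Treatment means (etch rate) by RF power (W), with MS_E from the ANOVA table.
means = {160: 551.2, 180: 587.4, 200: 625.4, 220: 707.0}
n, mse = 5, 333.70
s_E = sqrt(mse)

# Tabled critical values quoted in the notes (assumed here, not computed).
T_CRIT = 2.12   # t_{.025,16}
Q_CRIT = 4.05   # q_{.05,4,16}

lsd = T_CRIT * s_E * sqrt(2 / n)   # balanced-case LSD
tukey = Q_CRIT * s_E / sqrt(n)     # balanced-case Tukey value


def compare(cutoff):
    """Flag each pair of means whose absolute difference exceeds the cutoff."""
    return {(p, q): abs(means[p] - means[q]) > cutoff
            for p, q in combinations(means, 2)}


def diff_ci(p, q, crit=T_CRIT):
    """CI for mu_p - mu_q: (ybar_p - ybar_q) +/- crit * s_E * sqrt(1/n + 1/n)."""
    half = crit * s_E * sqrt(1 / n + 1 / n)
    d = means[p] - means[q]
    return d - half, d + half


# Bonferroni: with a means there are k = a(a-1)/2 pairwise comparisons.
a = len(means)
k = a * (a - 1) // 2
alpha_per_test = 0.05 / k

print(f"LSD = {lsd:.2f}, Tukey = {tukey:.2f}, k = {k}, alpha/k = {alpha_per_test:.4f}")
print("all pairs differ (LSD):  ", all(compare(lsd).values()))
print("all pairs differ (Tukey):", all(compare(tukey).values()))
print("95% CI for mu_180 - mu_160:", tuple(round(x, 2) for x in diff_ci(180, 160)))
```

The Tukey cutoff is larger than the LSD, which reflects the price paid for controlling the error rate over all six comparisons; in this data set the smallest gap between adjacent means (36.2) still clears both cutoffs.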
A confidence interval that gives a confidence of $100(1 - \alpha)\%$ or more when making all pairwise comparisons is given by Tukey's Studentized range statistic.

Example 3.8: As an illustration of the LSD procedure, compute the LSD at $\alpha = .05$ for the plasma etching example:

$$LSD_{.05} = t_{.025, 16} \cdot s_E \sqrt{\frac{2}{n}} = t_{.025, 16} \sqrt{\frac{2 s_E^2}{n}} = 2.12 \sqrt{\frac{2(333.7)}{5}} = 24.49$$

This implies that any pair of means will be found to be significantly different if the difference in the sample means of the pair exceeds 24.49. We may use the underscoring procedure to make comparing every pair simpler. In this case, this method is unnecessary, as all pairs of means are found to be significantly different. (See the analysis of Exercise 3.10 below for an illustration of the underscoring procedure.)

Exercise 3.10

A product developer is investigating the tensile strength of a new synthetic fiber that will be used to make cloth for men's shirts. Strength is usually affected by the percentage of cotton used in the blend of materials for the fiber.
The developer conducts a completely randomized experiment with five levels of cotton content and replicates the experiment five times. The data are:

% of Cotton      15      20      25      30      35
                 7       12      14      19      7
                 7       17      19      25      10
                 15      12      19      22      11
                 11      18      18      19      15
                 9       18      18      23      11
$y_{i.}$         49      77      88      108     54
$n_i$            5       5       5       5       5
$\bar{y}_{i.}$   9.8     15.4    17.6    21.6    10.8

ANOVA Table

Source of Variation    d.f.    SS        MS        F        p-value
Percentage             4       475.76    118.94    14.76    < .0001
Error                  20      161.20    8.06
Total                  24      636.96

Since the F-value is 14.76 with a corresponding p-value of < 0.0001, reject the hypothesis that the tensile strength means are all equal. The percentage of cotton in the fiber appears to have an effect on the tensile strength.

Compute the LSD at $\alpha = .05$:

$$LSD_{.05} = t_{.025, 20} \cdot s_E \sqrt{\frac{2}{n}} = t_{.025, 20} \sqrt{\frac{2 s_E^2}{n}} = 2.086 \sqrt{\frac{2(8.06)}{5}} = 3.75$$

Arrange the sample means in increasing order of magnitude: $\bar{y}_{1.}$, $\bar{y}_{5.}$, $\bar{y}_{2.}$, $\bar{y}_{3.}$, $\bar{y}_{4.}$. The underscoring procedure gives:

Cotton    15%     35%     20%     25%     30%
Means     9.8     10.8    15.4    17.6    21.6
          ------------    ------------

The following conclusions may be made:

• The mean strength of the 30% cotton fiber is significantly different from all other fiber means and is the strongest.
• The mean strengths of the 20% and 25% cotton fibers are not different from each other but are significantly weaker than that of the 30% cotton fiber.
• The mean strengths of the 15% and 35% cotton fibers are not different from each other but are significantly weaker than those of the 20%, 25%, and 30% cotton fibers.
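The LSD and underscoring steps for Exercise 3.10 can be sketched in Python as follows. The sketch takes $MS_E = 8.06$ and $t_{.025,20} = 2.086$ from the notes as given. The function `underscore_groups` is a hypothetical helper that encodes one reading of the underscoring procedure (sort the means, then underline each maximal run of means whose range is below the LSD); it is not a routine the notes define.

```python
from math import sqrt
from statistics import mean

# Tensile strength observations by percentage of cotton (Exercise 3.10).
cotton = {
    "15%": [7, 7, 15, 11, 9],
    "20%": [12, 17, 12, 18, 18],
    "25%": [14, 19, 19, 18, 18],
    "30%": [19, 25, 22, 19, 23],
    "35%": [7, 10, 11, 15, 11],
}

MS_E = 8.06      # error mean square from the ANOVA table in the notes
T_CRIT = 2.086   # t_{.025,20}, tabled value quoted in the notes
n = 5

lsd = T_CRIT * sqrt(2 * MS_E / n)
means = {label: mean(obs) for label, obs in cotton.items()}


def underscore_groups(means, cutoff):
    """Group sorted means into maximal runs whose range is below the cutoff."""
    ordered = sorted(means.items(), key=lambda kv: kv[1])
    groups, last_end = [], -1
    for i in range(len(ordered)):
        j = i
        # Extend the run while the next mean is within `cutoff` of the run's smallest mean.
        while j + 1 < len(ordered) and ordered[j + 1][1] - ordered[i][1] < cutoff:
            j += 1
        if j > last_end:  # skip runs already contained in an earlier underline
            groups.append([label for label, _ in ordered[i:j + 1]])
            last_end = j
    return groups


groups = underscore_groups(means, lsd)
print(f"LSD = {lsd:.2f}")
for g in groups:
    print("not significantly different:", ", ".join(g))
```

Means that share a group would be joined by a common underline; a mean in a group by itself (here 30%) differs significantly from every other mean.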