
Analysis of Variance
MA 223

Problem

When the nose of the bar of a running chainsaw makes contact with something, especially if the saw is running at slow speed, the chain will bite into the object and the resulting reaction kicks the bar (with the still-spinning chain) upward toward the user, quickly and violently. This phenomenon is called kickback. It's bad. Thus a lot of effort goes into designing saw chains and bars that have low kickback without compromising other desirable traits (like cutting speed).

Four prototype chains have been designed and are to be evaluated for kickback. Each prototype is mounted on a chainsaw, which is then mounted on a "kickback" machine. The machine spins the chain at high speed and then runs a wooden sample into the nose of the bar on which the chain is mounted. This causes the saw (which is free to pivot) to kick upward. A system of pulleys and weights measures the forces involved, and this data is processed to produce a "kickback" angle, basically a rating of the kickback potential of that type of chain (mounted on that type of bar/saw combination). The data for 5 test runs of each of the prototypes is below:

Rep.          1     2     3     4     5
Prototype 1   45.2  46.8  47.1  46.9  44.4
Prototype 2   50.1  52.2  49.9  54.4  48.1
Prototype 3   44.9  42.1  40.8  40.7  42.1
Prototype 4   41.2  44.0  45.1  41.0  43.8

The data are angles, in degrees. Lower is better. Is there any reason to believe that any of the chains has significantly different kickback potential than any other chain?

The Mathematical Model and Assumptions

This is an example of a one-way analysis of variance. It's called "one-way" because we have a single factor of interest, chain type. This factor appears at four levels (there are four prototypes). Suppose we have a one-way ANOVA (ANOVA = analysis of variance) with a single factor at k different levels (so above, k = 4). The different levels are called treatments, so we have four treatments in the case above. Let µ_i be the "true" value of treatment i.
Define the overall mean to be

\[
\mu = \frac{1}{k}\sum_{i=1}^{k} \mu_i. \tag{1}
\]

In the case above, this is the "true" average kickback angle for these four chain prototypes. Define the treatment effect

\[
\tau_i = \mu_i - \mu; \tag{2}
\]

this is the deviation of the ith treatment from the overall mean. Note that \(\mu_i = \mu + \tau_i\). Note also that

\[
\sum_{i=1}^{k} \tau_i = 0. \tag{3}
\]

To help understand the setup here, try to imagine if the kickback test were PERFECTLY repeatable, with no noise. In this case the data might look like

Rep.          1     2     3     4     5
Prototype 1   45.2  45.2  45.2  45.2  45.2
Prototype 2   50.1  50.1  50.1  50.1  50.1
Prototype 3   44.9  44.9  44.9  44.9  44.9
Prototype 4   41.2  41.2  41.2  41.2  41.2

The overall mean would be µ = (1/4)(45.2 + 50.1 + 44.9 + 41.2) = 45.35, with treatment effects τ_1 = −0.15, τ_2 = 4.75, τ_3 = −0.45, τ_4 = −4.15. With no noise we can of course easily see that the chains differ, and which chain has the lowest kickback.

Now let x_{ij} denote the result of performing the experiment with treatment i, repetition j. If there's no noise we simply obtain x_{ij} = µ + τ_i. But in reality there IS noise, and our model for the noise (and the whole experiment) is that

\[
x_{ij} = \mu + \tau_i + \epsilon_{ij} \tag{4}
\]

where the ε_{ij} are independent samples of a normal random variable with zero mean and variance σ². In this model i runs from 1 to k (the number of treatments) and we assume that each treatment is measured m times (there are m replications). Equation (4) is our fundamental model for a one-way ANOVA with k treatments and m replications. In fact the number of replications doesn't have to be the same for each treatment, but we'll assume it is for simplicity.

The Hypothesis

Our null hypothesis will be

H_0: τ_1 = τ_2 = · · · = τ_k = 0

versus the alternative

H_1: τ_i ≠ 0 for at least one i.

We could also write H_0 as H_0: µ_1 = µ_2 = · · · = µ_k.

An Algebraic Identity

First, some notation.
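As a quick numerical check of these definitions, here is a short Python sketch (the variable names are my own, not from the text) that computes µ and the treatment effects from the noiseless kickback table above and verifies that the τ_i sum to zero:

```python
# "True" treatment means from the noiseless kickback table (one per prototype).
means = [45.2, 50.1, 44.9, 41.2]

k = len(means)
mu = sum(means) / k                       # overall mean, equation (1)
tau = [m_i - mu for m_i in means]         # treatment effects, tau_i = mu_i - mu

print(round(mu, 2))                       # 45.35
print([round(t, 2) for t in tau])         # [-0.15, 4.75, -0.45, -4.15]
print(abs(sum(tau)) < 1e-9)               # True: the tau_i sum to zero, equation (3)
```

This matches the hand computation in the text exactly, and illustrates why equation (3) holds automatically: each τ_i is a deviation from the average of the µ_i.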
Given any numbers x_{ij} (not necessarily of the form specified in equation (4)) with 1 ≤ i ≤ k, 1 ≤ j ≤ m, define

\[
x_{i\cdot} = \sum_{j=1}^{m} x_{ij}, \quad
\bar{x}_{i\cdot} = \frac{1}{m} x_{i\cdot}, \quad
x_{\cdot\cdot} = \sum_{i=1}^{k}\sum_{j=1}^{m} x_{ij}, \quad
\bar{x}_{\cdot\cdot} = \frac{1}{km} x_{\cdot\cdot}.
\]

Basically, anywhere a dot appears, we sum with respect to the relevant index. If there's a bar over the symbol, we average with respect to the index. Also, let's define the "total sum of squares" for the experiment,

\[
SS_T = \sum_{i=1}^{k}\sum_{j=1}^{m} (x_{ij} - \bar{x}_{\cdot\cdot})^2. \tag{5}
\]

Notice that if you wanted to compute the variance of the set of all measurements x_{ij}, you'd compute SS_T and then divide by the number of measurements minus 1. The fundamental algebraic identity around which ANOVA revolves is

\[
\underbrace{\sum_{i=1}^{k}\sum_{j=1}^{m} (x_{ij} - \bar{x}_{\cdot\cdot})^2}_{SS_T}
= \underbrace{m \sum_{i=1}^{k} (\bar{x}_{i\cdot} - \bar{x}_{\cdot\cdot})^2}_{SS_{Treatments}}
+ \underbrace{\sum_{i=1}^{k}\sum_{j=1}^{m} (x_{ij} - \bar{x}_{i\cdot})^2}_{SS_E}. \tag{6}
\]

The left side is just SS_T. This equation isn't too hard to prove; it's just messy algebra. It should be emphasized that equation (6) works for ANY set of numbers x_{ij}. The first sum on the right is called the treatment sum of squares and is written SS_Treatments. The second term on the right in (6) is called the error sum of squares or residual sum of squares and is written SS_E. Thus equation (6) can also be written as

\[
SS_T = SS_{Treatments} + SS_E. \tag{7}
\]

This emphasizes the basic idea of ANOVA: the total variation in the experiment can be attributed to two sources, the variability of the treatments and the "random" variability of the measurements.

It's worth considering the case in which the x_{ij} are of the form in equation (4) but with no noise or error, just to build some intuition. Thus let x_{ij} = µ + τ_i where the τ_i sum to zero; this is the noiseless data case from above. In this case we have x_{\cdot\cdot} = kmµ, \bar{x}_{\cdot\cdot} = µ, x_{i\cdot} = m(µ + τ_i), and \bar{x}_{i\cdot} = µ + τ_i. Equation (6) looks like

\[
\sum_{i=1}^{k}\sum_{j=1}^{m} \tau_i^2 = m \sum_{i=1}^{k} \tau_i^2 + \sum_{i=1}^{k}\sum_{j=1}^{m} 0^2,
\]

which you can easily check is true.
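Since equation (6) holds for ANY numbers, it's easy to check numerically. The sketch below (plain Python; the array values are arbitrary made-up numbers, not data from the text) verifies the identity on a small 3×4 table:

```python
# Verify SS_T = SS_Treatments + SS_E (equation (6)) on arbitrary numbers.
x = [
    [2.0, 5.0, 1.0, 4.0],   # "treatment" 1
    [7.0, 3.0, 8.0, 6.0],   # "treatment" 2
    [1.0, 2.0, 2.0, 3.0],   # "treatment" 3
]
k, m = len(x), len(x[0])

grand_mean = sum(sum(row) for row in x) / (k * m)     # x-bar-dot-dot
row_means = [sum(row) / m for row in x]               # x-bar-i-dot

ss_t = sum((x[i][j] - grand_mean) ** 2 for i in range(k) for j in range(m))
ss_treat = m * sum((rm - grand_mean) ** 2 for rm in row_means)
ss_e = sum((x[i][j] - row_means[i]) ** 2 for i in range(k) for j in range(m))

# The identity holds to floating-point precision.
assert abs(ss_t - (ss_treat + ss_e)) < 1e-9
```

Changing the entries of x to anything else leaves the assertion true, which is the point: (6) is pure algebra, independent of any statistical model.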
In the above case SS_Treatments is just m Σ_{i=1}^k τ_i²; in this case SS_Treatments is non-zero exactly when some τ_i ≠ 0. Because there is no noise, SS_E is exactly zero.

The Analysis of Variance

But now suppose that we have data which is noisy, following the model we developed above in equation (4). We obtain \bar{x}_{\cdot\cdot} = µ + \bar{ε}_{\cdot\cdot} and \bar{x}_{i\cdot} = µ + τ_i + \bar{ε}_{i\cdot}, and (6) becomes

\[
SS_T = \underbrace{m \sum_{i=1}^{k} (\tau_i + \bar{\epsilon}_{i\cdot} - \bar{\epsilon}_{\cdot\cdot})^2}_{SS_{Treatments}}
+ \underbrace{\sum_{i=1}^{k}\sum_{j=1}^{m} (\epsilon_{ij} - \bar{\epsilon}_{i\cdot})^2}_{SS_E}. \tag{8}
\]

The first term on the right is SS_Treatments and the second term is SS_E, written out explicitly in terms of the model. Both SS_Treatments and SS_E are random variables. The expected value of SS_Treatments turns out to be

\[
E(SS_{Treatments}) = (k-1)\sigma^2 + m \sum_{i=1}^{k} \tau_i^2, \tag{9}
\]

which is easy to prove if you expand out SS_Treatments. We also have

\[
E(SS_E) = k(m-1)\sigma^2. \tag{10}
\]

Now define the mean square for treatments as

\[
MS_{Treatments} = \frac{SS_{Treatments}}{k-1}
\]

and the mean square error as

\[
MS_E = \frac{SS_E}{k(m-1)}.
\]

You can immediately see that

\[
E(MS_{Treatments}) = \sigma^2 + \frac{m}{k-1} \sum_{i=1}^{k} \tau_i^2 \tag{11}
\]

and

\[
E(MS_E) = \sigma^2. \tag{12}
\]

The Statistic

Notice that if H_0 holds then E(MS_Treatments) and E(MS_E) are both equal to σ². Thus if H_0 holds we would expect the ratio MS_Treatments/MS_E to be close to 1, but if H_0 isn't true then MS_Treatments/MS_E will tend to be larger than one. In fact, it turns out that IF H_0 is true then

\[
F = \frac{MS_{Treatments}}{MS_E}
\]

follows an F distribution with k − 1 degrees of freedom in the numerator and k(m − 1) degrees of freedom in the denominator. This is the key fact we use to test the hypothesis H_0. We compute both MS_Treatments and MS_E and form their ratio F = MS_Treatments/MS_E. If H_0 holds, then F follows an F distribution. We compute the p-value for the value of F obtained and reject or don't reject H_0 at the appropriate significance level.
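To see the whole computation at once, here is a sketch (my own Python, not Minitab output) that carries the kickback data through equations (5)-(7) and forms the F ratio. The p-value would then come from an F distribution with k − 1 and k(m − 1) degrees of freedom (for example via scipy.stats.f.sf, if SciPy is available):

```python
# One-way ANOVA by hand for the kickback data.
data = [
    [45.2, 46.8, 47.1, 46.9, 44.4],   # prototype 1
    [50.1, 52.2, 49.9, 54.4, 48.1],   # prototype 2
    [44.9, 42.1, 40.8, 40.7, 42.1],   # prototype 3
    [41.2, 44.0, 45.1, 41.0, 43.8],   # prototype 4
]
k, m = len(data), len(data[0])

grand = sum(sum(row) for row in data) / (k * m)       # x-bar-dot-dot
means = [sum(row) / m for row in data]                # x-bar-i-dot

ss_treat = m * sum((mi - grand) ** 2 for mi in means)
ss_e = sum((xij - means[i]) ** 2 for i, row in enumerate(data) for xij in row)

ms_treat = ss_treat / (k - 1)      # mean square for treatments, (k-1) d.f.
ms_e = ss_e / (k * (m - 1))        # mean square error, k(m-1) d.f.
F = ms_treat / ms_e

print(round(F, 2))                 # about 23.45, far out in the tail of F(3, 16)
```

An F value this large has a tiny p-value for 3 and 16 degrees of freedom, so this computation previews the conclusion you should see when you run the ANOVA in Minitab in Problem 1.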
Follow Up

If we do end up rejecting H_0 (not all treatments are the same) then we should perform some analysis to determine what effect the treatments have, or to compare two treatments. The quantity

\[
t = \frac{\bar{x}_{i\cdot} - (\mu + \tau_i)}{\sqrt{MS_E/m}}
\]

follows a t distribution with k(m − 1) d.f. You can use this to put a confidence interval on τ_i. Similarly, the quantity

\[
t = \frac{\bar{x}_{i\cdot} - \bar{x}_{j\cdot} - (\tau_i - \tau_j)}{\sqrt{2\,MS_E/m}}
\]

follows a t distribution with k(m − 1) d.f., which you can use to put a confidence interval on τ_i − τ_j. Section 9.3 in the text also outlines "Tukey's Procedure," a methodical approach to comparing the treatment means when the hypothesis H_0 of equal means has been rejected.

Checking the Assumptions

Running an ANOVA requires that the core assumptions of IID normal residuals with equal variance be met. These assumptions should be checked. Assessing the IID assumption probably requires some knowledge of how the data were collected, in particular the order in which the data were collected. To check equality of variances and normality, note that the (estimated) residuals are given by

\[
\hat{e}_{ij} = x_{ij} - \bar{x}_{i\cdot}.
\]

If we define

\[
s_i^2 = \frac{1}{m-1} \sum_{j=1}^{m} \hat{e}_{ij}^2
\]

then the statistics s_1², ..., s_k² are estimates of the variance σ_i² for each treatment, and so can be used to test the equality of variances hypothesis σ_1² = · · · = σ_k². There are several ways to test this hypothesis, including "Levene's test," which is built into Minitab. If the equality of variances test is passed, then one can perform a normality test on the residuals ê_{ij}.

Problems

1. Enter the kickback data above into Minitab. Run a one-way ANOVA. Pay attention to the various sums of squares. Look at the F statistic and associated p-value. Would you accept or reject H_0: τ_1 = τ_2 = τ_3 = τ_4 = 0? If you reject H_0, compute 95 percent confidence intervals on the τ_i using MS_E as a pooled estimate of the variance.

2. The compressive strength of concrete is being studied and four different mixing techniques are being investigated.
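The residual and per-treatment variance computations above are easy to do directly. Here is a sketch (Python, with my own variable names) that computes the ê_{ij} and the s_i² for the kickback data, as an informal look at the equal-variance assumption; Levene's test in Minitab is the formal version:

```python
# Per-treatment residuals and variance estimates for the kickback data.
data = [
    [45.2, 46.8, 47.1, 46.9, 44.4],   # prototype 1
    [50.1, 52.2, 49.9, 54.4, 48.1],   # prototype 2
    [44.9, 42.1, 40.8, 40.7, 42.1],   # prototype 3
    [41.2, 44.0, 45.1, 41.0, 43.8],   # prototype 4
]
m = len(data[0])

s2 = []
for i, row in enumerate(data, start=1):
    xbar = sum(row) / m                          # x-bar-i-dot
    resid = [xij - xbar for xij in row]          # e-hat-ij = x_ij - x-bar-i-dot
    s2_i = sum(e ** 2 for e in resid) / (m - 1)  # s_i^2, with m-1 d.f.
    s2.append(s2_i)
    print(f"treatment {i}: s_i^2 = {s2_i:.3f}")
```

The four s_i² values here are of roughly comparable size, which is at least consistent with the equal-variance assumption; a formal test (Levene's, or a normality test on the pooled residuals) is still the right follow-up.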
The following data have been collected:

Trial        1     2     3     4
Technique 1  3129  3000  2865  2890
Technique 2  3200  3300  2975  3150
Technique 3  2800  2900  2985  3050
Technique 4  2600  2700  2600  2765

(a) Test the hypothesis that mixing technique affects the strength of the concrete. Use α = 0.05, and compute the p-value of the F statistic.

(b) Put a 95 percent confidence interval on each treatment effect.