One-Way Analysis of Variance (One-Way ANOVA) Notes

Topics:
o Experiments versus Studies
o Types of Experiments
o Assumptions & Assumption Checks
o Types of Analysis
o NCSS

When finished with the One-Way ANOVA notes, you should be able to
o Define the ANOVA terms
o Explain why the test of equal means is conducted as it is
o Determine when it should be used
o Explain the requirements of the procedure
o Determine if the requirements are met
o Test a hypothesis of equal means
o Estimate the difference in sample means
o Get NCSS to calculate the values

1. Experiments versus Studies

1.1 Terminology
Response variable, Y – a quantitative variable that may depend on the values of a qualitative variable (factor).
Example: a student's grade may depend on several factors, e.g., the student's major and the number of hours studied. Note the last factor is actually quantitative but can be broken up into categories (< 3 hours, 3 to 5 hours, and > 5 hours, for example) in order to use Analysis of Variance.
Treatments – the levels, or combinations of levels, of the factor(s). Example: specific combinations of hours studied and teaching method.

1.2 Data Collection
Observational Study – Data is collected 1 of 2 ways:
o a random sample is drawn from each population defined by the levels of the factor(s) and the response variable is measured
o a random sample is taken from a population and, for each observation, the response variable and the level of the factor(s) are measured
Since you cannot control the values of the factor but just measure them, no cause-effect relationship can be established.

Factorial Experiment – Data is collected as follows:
o A random sample of experimental units is selected from the population
o Each experimental unit is randomly assigned to one of the treatment levels
o After the experiment, the response variable is measured
Since the experimental units are assumed to be similar before the experiment, if the average value of the response variable differs among the treatment levels after the experiment, you can assume a cause-effect relationship.
Note: Care has to be taken in the experimental protocol to make sure that extraneous factors are not affecting the responses. For example, see
o http://en.wikipedia.org/wiki/Protocol_(natural_sciences)
o http://www.dummies.com/how-to/content/designing-experimentsusing-the-scientific-method.html
o http://www2.hawaii.edu/~halina/ExpDesWk/ResSteps.pdf

Balanced Designs
o Balanced – same number of observations assigned to or sampled from each treatment

2. Types of Experimental Designs

2.1 One- or Multiple-Factor Randomized Design – Experimental units are randomly assigned to the levels (or combinations of levels) of the factor(s).

2.2 Randomized Block Design – Randomization occurs only within subsets (blocks) of experimental units.

2.3 Examples
Teaching method only: Students (the experimental units) are randomly assigned to three teaching methods and their grade on a test (the response variable) is measured at the end of the experiment.
Teaching method and hours studied: Students (the experimental units) are randomly assigned to one of 9 combinations of three teaching methods and three levels of studying. Their grade on a test (the response variable) is measured at the end of the experiment.
Teaching method within major: Students are first grouped according to their major (block) and then randomly assigned to each teaching method within each major. A test grade is measured at the end of the experiment.

3. Assumptions & Assumption Checks

3.1 Assumptions
Same Variance: The population variance of the response variable is assumed to be the same in each population defined by the levels of the treatment.
Independence: Experimental units are randomly assigned to (or, in the case of an observational study, randomly sampled from populations defined by) each treatment level.
Normality: The population distribution of the values of the response variable is assumed to be normal in each population defined by the levels of the treatment.

3.2 Example: You are conducting an experiment to see if different teaching methods have differing average exam scores.
Same Variance: The population variance of the exam grades is assumed to be the same for each teaching method.
Independence: Students are randomly assigned to (or, in the case of an observational study, randomly sampled from populations of students taking) each teaching method.
Normality: The population distribution of the exam grades is assumed to be normal for each teaching method.

3.3 Assumption checks – from SAS (a software sketch follows)
Modified Levene – comparing differences to center
Box Plots
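The Modified Levene and normality checks above can also be run outside NCSS or SAS. Below is a minimal sketch in Python using scipy (an assumption; it is not the course's required software), with made-up exam scores for three hypothetical teaching methods rather than data from these notes:

    from scipy import stats

    # Made-up exam scores for three hypothetical teaching methods
    method_a = [78, 85, 90, 72, 88, 81]
    method_b = [81, 79, 94, 70, 85, 77]
    method_c = [65, 74, 80, 69, 77, 72]

    # Modified Levene (Brown-Forsythe) test of equal variances:
    # center='median' compares each value's difference to its group center
    lev_stat, lev_p = stats.levene(method_a, method_b, method_c, center='median')
    print(f"Modified Levene p-value: {lev_p:.3f}")  # a small p-value suggests unequal variances

    # Shapiro-Wilk test of normality within each group
    for name, scores in [("A", method_a), ("B", method_b), ("C", method_c)]:
        w_stat, p = stats.shapiro(scores)
        print(f"Shapiro-Wilk p-value, method {name}: {p:.3f}")

Box plots of the groups (for example, with matplotlib's boxplot) give the visual version of the same checks.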
4. Analysis
o Understanding why variances are being used
o Test assumptions
o Determine sources of variability and degrees of freedom
o Test equality of means
o Estimate differences in averages

4.1 Understanding why variances are being used
Sample means vary. Even if the sample means come from the same population, they will vary just due to randomness.
If the sample means came from the same population (equal population means), variation in the data can help estimate the variation of the sample means.
Compare the variation of the sample means to what it should have been if the population means had all been equal (an F ratio comparing variances).
If the ratio is large enough to be unlikely to have occurred by randomness, the sample means vary enough to say that the population means are not all equal. (Fifth Building Block of the course)
Notes: The (weighted) variation of the sample means is called the "between treatments" variation. The between treatments variation is compared to the variation of the data, or the "within treatments" variation.

4.2 To test the assumptions, use NCSS, which includes the necessary tests in the One-Way ANOVA procedure. It will tell you the result of each assumption test.

4.3 Sources of variability and degrees of freedom
Total: Variability of the data values around the overall average, ignoring which sample they came from; divisor of (n - 1).
MST, mean square for treatments: Weighted variability of the sample averages. It has a divisor of (c - 1), where c is the number of averages, one for each treatment level.
MSE, mean square for error: Variability due to randomness. It has a divisor of (n - c).

4.4 Testing equal means

4.4.1 Steps in testing equal means (a software sketch follows this list):
Null hypothesis: the average value of Y is the same for all levels of the factor.
Alternative: at least two are different.
Test Statistic: Compares the weighted variation of the treatment averages to the variation of random data. This ratio, F = MST/MSE, is the F statistic.
Rejection Region: If the F ratio is large (above the F-table value), reject equal population means. The F table has two degrees of freedom: numerator degrees of freedom, c - 1, and denominator degrees of freedom, n - c. The F table is Table 6(a), found on pages B-12 and B-13 in the text.
Conclusion: We can (not) say the average value of Y differs for at least two levels of the factor.
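As referenced in the steps above, here is a minimal sketch of the between/within calculation in Python (numpy and scipy assumed available); the three samples are made-up illustration data, not the supplier example that follows:

    import numpy as np
    from scipy import stats

    # Made-up samples for three treatment levels
    groups = [np.array([78.0, 85, 90, 72, 88]),
              np.array([81.0, 79, 94, 70, 85]),
              np.array([65.0, 74, 80, 69, 77])]

    n = sum(len(g) for g in groups)             # total number of observations
    c = len(groups)                             # number of treatment levels
    grand_mean = np.concatenate(groups).mean()  # overall average

    # Between-treatments (weighted) variation and within-treatments variation
    ss_treat = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    ss_error = sum(((g - g.mean()) ** 2).sum() for g in groups)

    mst = ss_treat / (c - 1)                    # mean square for treatments
    mse = ss_error / (n - c)                    # mean square for error
    f_ratio = mst / mse                         # test statistic F = MST/MSE

    f_table = stats.f.ppf(0.95, c - 1, n - c)   # 5% F-table value
    print(f"F = {f_ratio:.3f}, F table = {f_table:.3f}, reject H0: {f_ratio > f_table}")

    # Cross-check with scipy's built-in one-way ANOVA
    f_check, p_value = stats.f_oneway(*groups)
    print(f"f_oneway check: F = {f_check:.3f}, p-value = {p_value:.4f}")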
4.4.2 Example:
Y = tensile strength of a product
Factor = 4 suppliers (c = 4)
Obtain samples of size 5 from each supplier (n = 20); MST = 21.095 and MSE = 6.094.
Null hypothesis: μ1 = μ2 = μ3 = μ4 (average tensile strength is the same for all suppliers)
Alternative: at least two are different
Rejection Region: If the variation of the sample means is large enough (F > F table), reject the null.
o Numerator degrees of freedom is the degrees of freedom from finding the variability of four values: 4 - 1 = 3
o Denominator degrees of freedom comes from adding the degrees of freedom of each sample = (5-1)+(5-1)+(5-1)+(5-1) = 16, or n - c = 20 - 4 = 16
o F table = 3.24
o If the variation of your sample means is more than 3.24 times what it should have been if the population means were equal, then reject H0.
Test Statistic: MST/MSE = 21.095/6.094 = 3.46 (The variation in the sample means was 3.46 times what you would expect under equal population means.)
Decision: Since the sample means vary too much, you can conclude that not all population means are the same.
Conclusion: We can say that the average tensile strength differs for at least two suppliers.

Shortened version of the answer
H0: μ1 = μ2 = μ3 = μ4
H1: at least two are different
Rejection Region: If F > F(3, 16) = 3.24, then reject H0
Test Statistic: MST/MSE = 21.095/6.094 = 3.46
Decision: Reject H0
Conclusion: We can say that the average tensile strength differs for at least two suppliers.

For other examples go to http://wweb.uta.edu/faculty/eakin/busa5325/OneWayAnova.xls and use F9 to generate new examples. Exercise on Blackboard. Due date is the day before the next exam.

4.5 Estimating Differences in Population Means

4.5.1 Estimate – the difference in the sample means. By the Second Building Block of the course we know that this difference tends to be in error and does not equal the difference in the population means. By the Fourth Building Block we know that the largest error in the difference we would expect with a specified probability is called the _________________________, which is found by multiplying ___________ and ___________.

4.5.2 Standard Error: Tukey's procedure uses
sqrt[ (MSE/2) (1/nj + 1/nj') ]
as a measure of the sampling error caused by variability and randomness. The symbol nj is the number of observations in the first sample mean and nj' is the number of observations in the second sample mean. (The actual standard error of a difference in two sample means is the above formula times the square root of 2.) Note that if all the sample sizes are the same (a balanced design), the measure of the standard error reduces to sqrt( MSE/ng ), where ng is the common sample size.

4.5.3 Table Value – The Tukey table (Table 7(a), page B-20) uses two degrees of freedom: the column corresponds to the number of means and the row corresponds to the d.f. of the denominator of the F test. (A sketch showing both table lookups in software follows.)
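Both table lookups used in these examples can also be done in software. A minimal sketch, assuming scipy version 1.7 or later (the version that added the studentized range distribution):

    from scipy import stats

    # F-table value for the supplier example: alpha = 0.05, df = (c - 1, n - c) = (3, 16)
    print(round(stats.f.ppf(0.95, 3, 16), 2))                  # approximately 3.24

    # Tukey (studentized range) table value: 4 means, 16 error degrees of freedom
    print(round(stats.studentized_range.ppf(0.95, 4, 16), 2))  # approximately 4.05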
4.5.4 Example: You wish to estimate the difference in average tensile strength of a product between each pair of 4 suppliers. You have measured the tensile strength of five products from each supplier and found that MST = 21.095 and MSE = 6.094. You have also calculated the sample means: Supplier 1 = 19.52, Supplier 2 = 24.26, Supplier 3 = 22.84, and Supplier 4 = 21.16. What are the confidence intervals for the differences in the population means?

Solution:
Step 1: Find Tukey's measure of sampling error. Since the sample size is the same for each supplier, this needs to be calculated only once:
sqrt[ (MSE/2) (1/nj + 1/nj') ] = sqrt[ (6.094/2) (1/5 + 1/5) ] = sqrt( 6.094/5 ) = 1.103993

Step 2: Find the Tukey table value. Since there are 4 suppliers and the error degrees of freedom is 20 - 4 = 16, the table value is 4.05.

Step 3: Calculate the margin of error. The margin of error, called the critical range in the text, is the table value times the measure of sampling error = 4.05 * 1.103993 = 4.4712.

Step 4: Calculate the differences in each pair of sample means.
Supplier 2 compared to Supplier 1 = 24.26 - 19.52 = 4.74
Supplier 3 compared to Supplier 1 = 22.84 - 19.52 = 3.32
…
Supplier 3 compared to Supplier 4 = 22.84 - 21.16 = 1.68

Step 5: Interpret the intervals: e.g., we are 95% confident that the largest error expected in any of the above differences is ±4.4712.

Examples of two of the comparisons:
You could also say that on average Supplier 2 is from 0.27 to 9.21 higher in tensile strength than Supplier 1 (4.74 - 4.47 = 0.27 to 4.74 + 4.47 = 9.21).
You could also say that on average Supplier 3 is from 2.79 lower to 6.15 higher in tensile strength than Supplier 4 (1.68 - 4.47 = -2.79 to 1.68 + 4.47 = 6.15).
(A software version of this example appears at the end of these notes.)

For other examples (equal sample sizes only) go to http://wweb.uta.edu/faculty/eakin/busa5325/TukeyOneWayAnova.xls and use F9 to generate new examples. Exercise on Blackboard. Due date is the day before the next exam.

5. SAS

5.1 Data format: place all the values of Y in one column and let the next column indicate which level of the factor is associated with that Y value.

5.2 Approach – see class web site.

6. Use in Business

Six Sigma: http://ezinearticles.com/?Design-of-Experiments-for-Six-Sigma&id=101212

7. Review Questions

1. When testing the equality of 5 population means, what would be wrong with using an F-test value = 0.935?
2. If you find the F-test value to be smaller than the F-table value, you can conclude …
3. After determining that at least one of the population means differs from the others, what would be the obvious next question?
4. Suppose you find that the variability of the sample means is 400 times what it should have been if the population means had been equal. Which of the equations below represents this?
5. You have a random sample of 8 objects from each of 5 populations. What table value would you use to show that at least two population means differ?
6. Which of the following allows you to make cause-effect statements?
7. You have three random samples of 4 each. If you have an ANOVA test statistic value of 10.34, would you reject equal means? Why or why not?
8. Suppose you are making a new absorbent paper towel and wish to compare it to the two leading brands. You sample 15 sheets of paper from each of the towels and measure how much water each can absorb. In this case, what would be the meaning of the equal variance assumption of analysis of variance?
9. In analysis of variance, why do we assume the variation of the population data is the same for all groups?
10. The status quo is that the average GPA is the same across colleges. You decide to test this using analysis of variance. What would be the interpretation of the power of the test?
11. If you determined that differences exist between population means, you would follow this with …
12. If you have 4 random samples, each of size 25, the Tukey table would use ___ and ____ degrees of freedom.
13. You are measuring the difference in average salaries between different majors. If the difference between two sample means was $4 and the Tukey margin of error was ±$100,000, you would conclude that …
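As noted in 4.5.4, here is a software version of that Tukey comparison: a minimal sketch assuming scipy 1.7 or later, using only the summary numbers given in the example (MSE, the common sample size, and the four supplier sample means):

    from itertools import combinations
    import numpy as np
    from scipy import stats

    means = {"Supplier 1": 19.52, "Supplier 2": 24.26,
             "Supplier 3": 22.84, "Supplier 4": 21.16}
    mse, n_g, c, df_error = 6.094, 5, 4, 16

    # Tukey measure of sampling error and critical range (margin of error)
    se = np.sqrt((mse / 2) * (1 / n_g + 1 / n_g))        # about 1.104
    q = stats.studentized_range.ppf(0.95, c, df_error)   # about 4.05
    critical_range = q * se                              # about 4.47

    # Confidence interval for each pairwise difference in sample means
    for (name_i, m_i), (name_j, m_j) in combinations(means.items(), 2):
        diff = m_i - m_j
        print(f"{name_i} - {name_j}: {diff:+.2f}  "
              f"interval ({diff - critical_range:.2f}, {diff + critical_range:.2f})")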