College of Arts and Sciences
Two-way Analysis of Variance (ANOVA)
Lesson 4

I. INTRODUCTION

In the previous section, you studied how to compare three or more means and apply the post hoc test. A one-way ANOVA uses just one independent variable, for example when comparing how much patients' cholesterol improves under four different types of diet. The independent variable (IV) is the type of diet and the dependent variable (DV) is the cholesterol level. Adding another variable, such as two types of medication, gives two independent variables: the type of diet and the type of medication. Comparing means across two independent variables is called two-way ANOVA. This lesson also shows how to interpret findings from the statistical results produced by the Minitab software (Minitab 17) and how to draw contextualized conclusions.

II. OBJECTIVES

At the end of this lesson, you (students) are expected to:
a. differentiate one-way and two-way ANOVA;
b. apply two-way ANOVA appropriately using statistical software; and
c. write findings and conclusions based on the statistical results of two-way ANOVA and post hoc tests.

III. LESSON PROPER

There are several types of Analysis of Variance (ANOVA), such as one-way ANOVA and two-way ANOVA. They differ in the number of independent variables they have. The most commonly used are one-way and two-way ANOVA. The following assumptions must be satisfied before using two-way ANOVA:

Assumptions of the F-test according to Bluman (2009)
1. The populations from which the samples were obtained must be normally or approximately normally distributed.
2. The samples must be independent of one another.
3. The variances of the populations must be equal.
4. The dependent variable must be interval data.
5. The groups must be equal in sample size.

What is the difference between one-way ANOVA and two-way ANOVA?
A one-way ANOVA has just one independent variable (IV), whereas a two-way ANOVA has two IVs. For instance, a researcher wants to know how the length of patients' hospital stays is affected by the different types of post-operative treatment (treatment 1, treatment 2, treatment 3). The type of post-operative treatment is the IV and the length of hospital stay is the DV. However, if there are two independent variables, two-way ANOVA is the appropriate statistical test to use. For instance, in addition to IV₁ (post-operative treatment), there is a second variable, IV₂, the type of medication taken (Medicine 1, Medicine 2).

1 | Page 1 4 Anton A. Romero

Example 1. Investigate whether sleep and sex affect health score. Draw your findings and conclusions. Assume that all the assumptions are satisfied. Use α = .05.

Solutions: In the given problem, there are two IVs (sex and hours of sleep) and one DV (health score).

Step 1: Input the data for sleep, sex and health score in different columns. The codes 1 and 2 stand for male and female, respectively.

Step 2: Click General Linear Model, click Model, highlight the IVs, click Add, then click OK twice.

Step 3: Draw findings from the ANOVA table.
Since the p-value for sleep (0.003) is less than .05, there is a significant difference in health score among the different sleeping times.
Since the p-value for sex (0.002) is less than .05, there is a significant difference in health score between males and females.
Since the p-value for the interaction (sleep*sex) (.012) is less than .05, there is a significant interaction between sleep and sex on health score.

Note: Post hoc test must be executed on the variables resulting in significant results (sleep, sex, interaction).
However, the post hoc test for the interaction is not covered in this discussion.

Step 4: Execute the post hoc test for variables with significant results. Click Comparisons, select Tukey, double-click sleep and sex, click Results, select Tests and confidence intervals, then click OK.

Step 5: Draw findings from the post hoc results. Look at the first column of the post hoc result: the difference of means for 7 – 6 is 5.50 (positive), which shows that the mean for 7 hours of sleep is higher than the mean for 6 hours of sleep.

The findings:
Since the p-value (.009) is less than .05, there is a significant difference between the health scores of people who sleep 7 and 6 hours.
Since the p-value (.004) is less than .05, there is a significant difference between the health scores of people who sleep 8 and 6 hours.
Since the p-value (.898) is greater than .05, there is no significant difference between the health scores of people who sleep 8 and 7 hours.

Other findings:
Since the p-value (.002) is less than .05, there is a significant difference between male and female health scores.

Note: If the means for males and females are available, there is no need to include the variable sex in the post hoc test, because a factor with only two levels has just one pairwise comparison.

Step 6: Draw your conclusion/s. The following are possible conclusions and other perspectives (some are mandatory, some are alternative conclusions):
(1) Females are healthier than males. Males are more prone to sickness compared to females.
(2) 7 and 8 hours of sleep are better than 6 hours of sleep. People who sleep at least 7 hours per day are healthier.
(3) Hours of sleep affect the health of both males and females. Specifically, being female does not guarantee better health than being male, or vice versa; health also depends on hours of sleep (e.g., females can be unhealthy because they slept for only 6 hours).
(4) Sex affects the health of people who sleep for 6, 7 and 8 hours.
Specifically, sleeping 7 or 8 hours a day is not a guarantee of good health; health can also depend on sex (e.g., people who sleep 7 or 8 hours a day may still be unhealthy because they are male).

Note: There are at least four mandatory conclusions in this problem. More specific findings can be obtained if a post hoc test is done for the interaction.

Example 2. A psychiatrist studied the effects of three antidepressants on subjects in three different age groups. Each subject was rated on a scale of 0 to 100, with higher numbers indicating greater relief from depression. Assume that all the assumptions are satisfied. Use α = .05. The following table presents the results.

Solutions: In the given problem, there are two IVs (types of drugs and age group) and one DV (relief of depression).

Step 1: Input the data for types of drugs, age group and depression relief score in different columns.

Step 2: Click General Linear Model, transfer the DV (depression relief) to Responses and the IVs (types of drugs and age group) to Factors, click Model, highlight the IVs, click Add, then click OK twice.

Step 3: Draw findings from the ANOVA table.

Analysis of Variance

Source                    DF   Adj SS   Adj MS  F-Value  P-Value
Types of Drugs             2  5386.50  2693.25  2879.91    0.000
Age Group                  2  1362.67   681.33   728.55    0.000
Types of Drugs*Age Group   4     2.33     0.58     0.62    0.650
Error                     27    25.25     0.94
Total                     35  6776.75

Since the p-value for types of drugs (0.000) is less than .05, there is a significant difference in depression relief scores among the three antidepressant drugs.
Since the p-value for age group (0.000) is less than .05, there is a significant difference in depression relief scores among the age groups.
Since the p-value for the interaction (types of drugs*age group) (.650) is greater than .05, there is no significant interaction between types of drugs and age group on depression relief scores.
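The Adj MS and F-Value columns of the Example 2 ANOVA table can be checked by hand, since each mean square is the sum of squares divided by its degrees of freedom and each F-value is the effect's mean square divided by the error mean square. The following is a minimal Python sketch of that arithmetic, not part of the Minitab procedure; the names rows, mean_square and f_values are chosen here only for illustration, and the DF and Adj SS values are copied from the table.

```python
# Rows of the Example 2 ANOVA table: (source, DF, Adj SS), copied from the Minitab output.
rows = [
    ("Types of Drugs", 2, 5386.50),
    ("Age Group", 2, 1362.67),
    ("Types of Drugs*Age Group", 4, 2.33),
]
error_df, error_ss = 27, 25.25  # the Error row of the same table


def mean_square(ss, df):
    """Mean square = sum of squares divided by its degrees of freedom."""
    return ss / df


def f_values(effect_rows, err_ss, err_df):
    """Return {source: (Adj MS, F-Value)}, where F = MS_effect / MS_error."""
    ms_error = mean_square(err_ss, err_df)
    return {
        source: (mean_square(ss, df), mean_square(ss, df) / ms_error)
        for source, df, ss in effect_rows
    }


results = f_values(rows, error_ss, error_df)
for source, (ms, f) in results.items():
    print(f"{source:26s} MS = {ms:8.2f}   F = {f:8.2f}")
```

The printed values should agree with the table up to a small difference in the last decimal place, because the sums of squares in the table are already rounded. A p-value could then be obtained from the F distribution with (effect DF, error DF) degrees of freedom, for example with scipy.stats.f.sf if SciPy is available.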
Note: Post hoc test must be executed on the variables resulting in significant results (types of drugs and age group).

Step 4: Execute the post hoc test for variables with significant results. Click Comparisons, select Tukey, double-click types of drugs and age group, click Results, select Tests and confidence intervals, then click OK.

Step 5: Draw findings from the post hoc results. Look at the first column of the post hoc result: the difference of means for drug B and drug A is 9.00 (positive), which shows that the mean of drug B is higher than the mean of drug A.

The findings:
Since the p-value (.000) is less than .05, there is a significant difference between the depression relief scores of drug B and drug A.
Since the p-value (.000) is less than .05, there is a significant difference between the depression relief scores of drug C and drug A.
Since the p-value (.000) is less than .05, there is a significant difference between the depression relief scores of drug C and drug B.

Other findings:
Since the p-value (.000) is less than .05, there is a significant difference between the depression relief scores of age groups 31-50 and 18-30.
Since the p-value (.000) is less than .05, there is a significant difference between the depression relief scores of age groups 51-80 and 18-30.
Since the p-value (.000) is less than .05, there is a significant difference between the depression relief scores of age groups 51-80 and 31-50.

Step 6: Draw your conclusion/s. The following are possible conclusions and other perspectives (some are mandatory, some are alternative conclusions):
(1) Among the three different drugs, drug C is the most effective antidepressant medicine, or drug A is the least effective antidepressant medicine.
(2) Drug B is more effective than drug A in relieving depression.
(3) The antidepressant drugs are most effective in the age group 51-80, or the antidepressant drugs are least effective in the age group 18-30.
(4) The antidepressant drugs are more effective for the age group 31-50 compared to the age group 18-30.
(5) The differences in effectiveness among the antidepressant drugs are the same across the age groups (no significant interaction).

Note: There are at least five mandatory conclusions in this problem.

IV. ASSESSMENT

A. Pre-class Activity: Answer the following questions and refer to the scoring rubric below.

Scoring Rubric
Criteria: Analyzing and Interpreting Data (Reading between and beyond the data)
1 (Idiosyncratic Reasoning): The findings based on the statistical result are incorrect.
2 (Transitional Reasoning): The findings based on the statistical result are correct, or the conclusion is primarily based on the data but may be only partially reasonable.
3 (Quantitative Reasoning): Makes a reasonable conclusion based on the findings and the context.
4 (Analytical Reasoning): Makes a reasonable conclusion based on the findings and the context and provides an additional reasonable perspective.

1. Researchers have sought to examine the effect of various types of music on agitation levels in patients who are in the early and middle stages of Alzheimer's disease. Patients were selected to participate in the study based on their stage of Alzheimer's disease. Three forms of music were tested: easy listening, Mozart, and piano interludes. While the patients listened to music, their agitation levels were recorded, with a high score indicating a higher level of agitation. Assume that all the assumptions are satisfied. Use α = .05. Below are the results of the ANOVA and the post hoc test.

B. Grouping Activity
Direction: Given the problem/data, compute using the Minitab software. Assume that all the assumptions for using a parametric test are satisfied. Refer to the scoring rubric below. One item will be assigned to each group.
SCORING RUBRIC

Criteria: Analyzing and Interpreting Data (Reading between and beyond the data)
1 (Idiosyncratic Reasoning): The findings based on the statistical result are incorrect.
2 (Transitional Reasoning): The findings based on the statistical result are correct, or the conclusion is primarily based on the data but may be only partially reasonable.
3 (Quantitative Reasoning): Makes a reasonable conclusion based on the findings and the context.
4 (Analytical Reasoning): Makes a reasonable conclusion based on the findings and the context and provides an additional reasonable perspective.

Criteria: Statistical Result (descriptors from lowest to highest level)
Significant errors or deficiencies in the statistical analysis indicate a need for substantial revision and improvement to meet acceptable standards of accuracy.
While there may be minor areas for improvement, the statistical analysis is generally accurate and thorough, with few errors or omissions.
The statistical analysis is highly accurate, with complete results.

Criteria: Presentation (descriptors from lowest to highest level)
Significant problems with conciseness and clarity reduce the presentation's effectiveness.
While there may be some areas for improvement, the presentation generally achieves clarity and conciseness, effectively conveying key information to the audience.
The presentation is exceptionally clear and concise, engaging the audience with well-structured content and minimal unnecessary detail.

1. A medical researcher wishes to test the effects of two different diets and two different exercise programs on the glucose level in a person's blood. The glucose level is measured in milligrams per deciliter (mg/dL). Three subjects are randomly assigned to each group. Assume that all the assumptions are satisfied. Use α = .05.

2. Vermont maple sugar producers sponsored a testing program to determine the benefit of a new fertilizer. A random sample of 27 maple trees in Vermont was chosen and treated with one of three levels of fertilizer.
In this experimental setup, nine trees (three in each of three climatic zones) were treated with each fertilizer level and the amount of sap produced (in milliliters) by the trees was measured. Sap is the fluid that circulates through a plant, carrying water and nutrients. Assume that all the assumptions are satisfied. Use α = .05. The results are as follows.

3. An agricultural scientist wants to determine how the type of fertilizer and the type of soil affect the yield of oranges in an orange grove. He has two types of fertilizer and three types of soil. For each of the six combinations of fertilizer and soil, the scientist plants four stands of trees and measures the yield of oranges (in tons per acre) from each stand. Assume that all the assumptions are satisfied. Use α = .05. The data are shown in the following table.

4. A hospital doctor wished to compare the effectiveness of four brands of painkillers: A, B, C and D. She arranged that when patients on a surgical ward requested painkillers, they would be asked whether their pain was mild, severe or very severe. The first patient who said mild would be given brand A, the second who said mild would be given brand B, the third brand C and the fourth brand D. Painkillers would be allocated in the same way to the first four patients who said their pain was severe and to the first four patients who said their pain was very severe. The patients were then asked to record the time, in minutes, for which the painkillers were effective. The following data were collected. Assume that all the assumptions are satisfied. Use α = .05.

5. A botanist wants to know whether or not plant growth is influenced by sunlight exposure and watering frequency. She plants 40 seeds and lets them grow for two months under different conditions for sunlight exposure and watering frequency. After two months, she records the height of each plant. The results are shown at the right. Assume that all the assumptions are satisfied. Use α = .05.

V. REFERENCES

Books
Abbott, M. L. (2017).
Using Statistics in the Social and Health Sciences with SPSS® and Excel®. John Wiley & Sons, Inc.
Bluman, A. G. (2009). Elementary Statistics: A Step by Step Approach (Eighth Edition). McGraw-Hill.
Chaudhary, K. (2020). Introduction to Biotechnology and Biostatistics. Delve Publishing.
Ho, R. (2018). Understanding Statistics for the Social Sciences with IBM SPSS. Taylor & Francis Group, LLC.
Navidi, W., & Monk, B. (2019). Elementary Statistics (Third Edition). McGraw-Hill Education.

Internet Source and Related Studies
ANOVA Examples. (n.d.). https://www.people.vcu.edu/~wsstreet/courses/314_20033/Examples.ANOVA.pdf
ANOVA Test Types, Table, Formula, Examples. (2021). Cuemath. https://www.cuemath.com/anova-formula/
http://eagri.org/eagri50/STAM101/pdf/pract07.pdf
https://www.cimt.org.uk/projects/mepres/alevel/fstats_ch7.pdf
Indoria, A. K., Sharma, K. L., Reddy, K. S., & Rao, C. S. (2017). Role of soil physical properties in soil health management and crop productivity in rainfed systems-I: Soil physical constraints and scope. Current Science, 2405-2414.
https://www.kaggle.com/
Mathew, T. K., & Tadi, P. (2020). Blood glucose monitoring.
Utah State University. (2024). What is Iron Chlorosis and What Causes it? | Forestry | Extension. Usu.edu. https://extension.usu.edu/forestry/trees-cities-towns/tree-care/causes-ironchlorosis#:~:text=The%20primary%20symptom%20of%20iron,as%20the%20plant%20cells%20die.

Prepared by:
ANTON A. ROMERO
Mathematics Instructor