Statistical Methods in Business Assignment: ANOVA & Kruskal-Wallis

Assignment 3 (Week 7 - 8)
ADM 2304 - Applications of Statistical Methods in Business
Telfer School of Management, University of Ottawa
Due Date and Time: Wednesday, 22 November 2023, 11:59 pm

Q1: Weight Loss

A pharmaceutical company would like to test the efficacy of 4 new weight loss medications. They selected a random sample of 24 patients, placed them on a diet, and assigned each weight loss medication to a group of 6 patients for a three-month period. The following table presents the weight loss (in lbs) for the 4 types of medication.

Drug 1   Drug 2   Drug 3   Drug 4
   5        8       10       13
  12        7       11       13
   9       10        9       11
   7        5       14        8
  10       10       16       10
   7        8       12       11

The pharmaceutical company would like to determine if there is a difference in average weight loss achieved among the 4 different medications.

(a) Identify the experimental design.

The experimental design is a completely randomized design with one factor (medication) at four levels: the patients in the sample are randomly selected and randomly assigned to one of Drug 1 to Drug 4, each group receiving a different weight loss medication. Moreover, this is an experimental study, since it examines a cause-and-effect relationship: the effect of the medication taken on the average weight loss achieved.

(b) Graph the data using side-by-side box plots. Are the assumptions of ANOVA met here? Explain.

Assumptions of ANOVA: the populations are approximately normally distributed; the population variances are equal; the observations are independent, i.e., the outcome for one patient does not affect the probability of the outcome for any other patient; the data are ratio level; and the samples are randomized, as specified.

(c) Perform an appropriate ANOVA test to determine whether there is a difference in average weight loss among the 4 different medications using both the critical value and p-value approaches. Use α = 0.05 and describe all necessary steps of the test of hypothesis (i.e., hypotheses, test statistic, critical value/p-value, decision with justification).
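For readers who want to check the F-ratio requested in (c) without StatCrunch, the following is a minimal sketch in plain Python. The group lists are transcribed from the Q1 table, and the function name `one_way_anova` is illustrative rather than part of any library. It computes only the mean squares and the F-ratio; the critical value and p-value still have to come from an F table or statistical software.

```python
from statistics import mean

# Weight loss (lbs) per medication, transcribed from the Q1 table.
groups = {
    "Drug 1": [5, 12, 9, 7, 10, 7],
    "Drug 2": [8, 7, 10, 5, 10, 8],
    "Drug 3": [10, 11, 9, 14, 16, 12],
    "Drug 4": [13, 13, 11, 8, 10, 11],
}


def one_way_anova(samples):
    """Return (MS_treatment, MS_error, F) for a one-factor design."""
    all_values = [x for s in samples for x in s]
    grand_mean = mean(all_values)
    k, n_total = len(samples), len(all_values)
    # Between-group variation: group means around the grand mean.
    ss_treat = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    # Within-group variation: observations around their own group mean.
    ss_error = sum((x - mean(s)) ** 2 for s in samples for x in s)
    ms_treat = ss_treat / (k - 1)
    ms_error = ss_error / (n_total - k)
    return ms_treat, ms_error, ms_treat / ms_error


ms_treat, ms_error, f_ratio = one_way_anova(list(groups.values()))
print(f"MST = {ms_treat:.4f}, MSE = {ms_error:.4f}, F = {f_ratio:.3f}")
```

The printed mean squares can be compared directly with the 23.3333 and 5.0667 used in the F-ratio of the answer to (c).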
(STATCRUNCH)

$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$
$H_A$: Not all means are equal

Test statistic:

$$F = \frac{MST}{MSE} = \frac{23.3333}{5.0667} = 4.605$$

The F critical value at df = (3, 20) and a 0.05 level of significance is 3.10. Since 4.605 > 3.10, the null hypothesis is rejected, which means that there is a significant difference in the average weight loss achieved among the 4 different medications.

P-value approach: 0.0132 < 0.05; therefore, we reject the null hypothesis.

(d) Calculate the sample standard deviation for each sample and the pooled variance. Compare your pooled variance with the ANOVA output on STATCRUNCH. Do your calculations agree with the STATCRUNCH computations?

(Manual Calculation and STATCRUNCH)

Sample standard deviations: $s_1 = 2.50$, $s_2 = 1.89$, $s_3 = 2.61$, $s_4 = 1.89$.

$$SSE = \sum_{i=1}^{4} (n_i - 1)s_i^2 = (n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + (n_3 - 1)s_3^2 + (n_4 - 1)s_4^2$$

$$SSE = (6 - 1)(2.50)^2 + (6 - 1)(1.89)^2 + (6 - 1)(2.61)^2 + (6 - 1)(1.89)^2 = 101.03$$

$$MSE = \frac{SSE}{N - K} = \frac{101.03}{24 - 4} = 5.05$$

Based on these calculations, the pooled variance (MSE) agrees with the STATCRUNCH value of 5.0667; the small gap comes from rounding the sample standard deviations to two decimals.

(e) If warranted, use Bonferroni adjusted confidence intervals to determine which weight loss medication(s) appear to be more effective and hence are significantly different. Assume the family level of significance α = 0.05.

(Manual Calculation)

A Bonferroni post hoc test is warranted because the F-test in part (c) rejected the null hypothesis, which means that at least one mean differs significantly from the others.

$$J = \binom{4}{2} = \frac{4!}{2!(4-2)!} = 6, \qquad t_{\alpha/(2J),\,N-K} = t_{0.05/12,\,20} = 2.9271$$

$$SE(\bar{y}_i - \bar{y}_j) = \sqrt{MSE\left(\frac{1}{n_i} + \frac{1}{n_j}\right)} = \sqrt{5.0667\left(\frac{1}{6} + \frac{1}{6}\right)} = 1.2996$$

Each interval has the form $\bar{y}_i - \bar{y}_j \pm t_{\alpha/(2J)} \sqrt{MSE\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$:

$\bar{y}_1 - \bar{y}_2$: 8.3333 − 8 ± 2.9271 × 1.2996 = (−3.47, 4.14)
$\bar{y}_1 - \bar{y}_3$: 8.3333 − 12 ± 2.9271 × 1.2996 = (−7.47, 0.14)
$\bar{y}_1 - \bar{y}_4$: 8.3333 − 11 ± 2.9271 × 1.2996 = (−6.47, 1.14)
$\bar{y}_2 - \bar{y}_3$: 8 − 12 ± 2.9271 × 1.2996 = (−7.80, −0.20)
$\bar{y}_2 - \bar{y}_4$: 8 − 11 ± 2.9271 × 1.2996 = (−6.80, 0.80)
$\bar{y}_3 - \bar{y}_4$: 12 − 11 ± 2.9271 × 1.2996 = (−2.80, 4.80)

Therefore, at the 5% family level of significance there is a significant difference in average weight loss only between Drug 2 and Drug 3: the interval for $\bar{y}_2 - \bar{y}_3$ excludes 0, while the other five intervals contain 0. Because that interval is entirely negative, Drug 3 appears to be more effective than Drug 2.

(f) Notwithstanding your answer to part (b), perform the Kruskal-Wallis nonparametric test to determine whether there is a difference among the median weight loss for the 4 medications using the critical value approach. Be sure to state your hypotheses, test statistic, critical value, rank calculation in EXCEL, and decision.

(Manual Calculation and EXCEL)

$H_0$: The 4 populations are identical
$H_A$: At least two of the populations differ in location

$$H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{T_i^2}{n_i} - 3(N+1) = \frac{12}{24(24+1)}\left(\frac{(51.5)^2}{6} + \frac{(45.5)^2}{6} + \frac{(106)^2}{6} + \frac{(97)^2}{6}\right) - 3(24+1) = 9.558$$

df = k − 1 = 4 − 1 = 3, so $\chi^2_{\alpha,\,k-1} = \chi^2_{0.05,\,3} = 7.815$.

Since H = 9.558 > 7.815, reject the null hypothesis at the 0.05 level of significance.

(g) Perform the Kruskal-Wallis nonparametric test using STATCRUNCH, include your output, and give your decision using the p-value approach. Do you have the same result as the manual calculation done in (f)?
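The rank sums and the H statistic from (f) can also be reproduced without a spreadsheet. The sketch below is a plain-Python illustration, not the EXCEL or STATCRUNCH procedure: the helper names `midranks` and `kruskal_wallis_h` are made up for this example, and the formula is the one written in (f), without a tie correction. Statistical packages often apply a tie correction, so their H and p-value may differ slightly from the manual figure.

```python
# Weight loss (lbs) per medication, transcribed from the Q1 table.
groups = [
    [5, 12, 9, 7, 10, 7],       # Drug 1
    [8, 7, 10, 5, 10, 8],       # Drug 2
    [10, 11, 9, 14, 16, 12],    # Drug 3
    [13, 13, 11, 8, 10, 11],    # Drug 4
]


def midranks(values):
    """Map each distinct value to its average rank (tied values share one rank)."""
    ordered = sorted(values)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        # 0-based positions i..j-1 hold ranks i+1..j; their mean is (i + 1 + j) / 2.
        ranks[ordered[i]] = (i + 1 + j) / 2
        i = j
    return ranks


def kruskal_wallis_h(samples):
    """Return (rank sums, H) using the uncorrected formula from part (f)."""
    pooled = [x for s in samples for x in s]
    n_total = len(pooled)
    rank_of = midranks(pooled)
    rank_sums = [sum(rank_of[x] for x in s) for s in samples]
    h = 12 / (n_total * (n_total + 1)) * sum(
        t ** 2 / len(s) for t, s in zip(rank_sums, samples)
    ) - 3 * (n_total + 1)
    return rank_sums, h


rank_sums, h_stat = kruskal_wallis_h(groups)
CHI2_CRIT = 7.815  # chi-square critical value for alpha = 0.05, df = 3, as quoted in (f)
print("Rank sums:", rank_sums)
print(f"H = {h_stat:.3f}, reject H0: {h_stat > CHI2_CRIT}")
```

Averaging the ranks of tied observations is what produces the half-integer rank sums such as 51.5 and 45.5.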
(STATCRUNCH)

p-value = 0.0212 < 0.05. Therefore, reject the null hypothesis: there is sufficient evidence to conclude that weight loss differs in location among the 4 medication groups. This is the same decision as the manual calculation in (f).

Q2: Quality of Production

A production manager suspects the quality of production is affected by both the supplier (A, B, C) of the production material and the shift (Day, Night, Swing) in which the product was produced. For this experiment, the manager randomly selects 5 quality scores of the production for each combination of supplier and shift, which are given in the following table.

Shift    Supplier A        Supplier B        Supplier C
Day      78 79 81 76 75    80 82 78 78 81    82 83 84 79 78
Night    91 89 78 87 82    88 87 89 76 80    81 75 89 78 76
Swing    75 74 79 80 81    85 72 80 75 81    79 79 81 80 82

(a) Identify the experimental design, number of factors along with levels, number of treatments, and number of replications.

This is a two-factor factorial design with replication. The two factors are Supplier and Shift. Supplier (factor A) has 3 levels: "A", "B", and "C". Shift (factor B) has 3 levels: "Day", "Night", and "Swing". The number of treatments is 9, since 3 supplier levels × 3 shift levels = 9 treatments. There are 5 quality scores for each combination of supplier and shift, which means there are 5 replications per treatment.

(b) Do the data provide sufficient evidence to indicate an interaction between Supplier and Shift? Conduct an appropriate test of hypothesis at the 5% level of significance using both the critical value and p-value approaches. Please provide hypotheses, test statistic, critical value, p-value, and decision with justification.
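As a cross-check on the StatCrunch ANOVA table used in parts (b) to (d), the sums of squares for a balanced two-factor design can be computed directly. The sketch below is a plain-Python illustration under the assumption that every cell has the same number of replications; the cell values are transcribed from the Q2 table, and `two_way_anova` is a hypothetical helper name, not a library function. It returns only the error mean square and the three F-ratios; critical values and p-values still come from tables or software.

```python
from statistics import mean

# Quality scores keyed by (shift, supplier), transcribed from the Q2 table.
cells = {
    ("Day", "A"): [78, 79, 81, 76, 75],
    ("Day", "B"): [80, 82, 78, 78, 81],
    ("Day", "C"): [82, 83, 84, 79, 78],
    ("Night", "A"): [91, 89, 78, 87, 82],
    ("Night", "B"): [88, 87, 89, 76, 80],
    ("Night", "C"): [81, 75, 89, 78, 76],
    ("Swing", "A"): [75, 74, 79, 80, 81],
    ("Swing", "B"): [85, 72, 80, 75, 81],
    ("Swing", "C"): [79, 79, 81, 80, 82],
}


def two_way_anova(data):
    """Balanced two-factor ANOVA with replication; returns MSE and F-ratios."""
    shifts = sorted({s for s, _ in data})
    suppliers = sorted({p for _, p in data})
    n = len(next(iter(data.values())))  # replications per cell (assumed equal)
    grand = mean([x for v in data.values() for x in v])
    shift_mean = {s: mean([x for p in suppliers for x in data[(s, p)]]) for s in shifts}
    supp_mean = {p: mean([x for s in shifts for x in data[(s, p)]]) for p in suppliers}
    ss_shift = n * len(suppliers) * sum((shift_mean[s] - grand) ** 2 for s in shifts)
    ss_supp = n * len(shifts) * sum((supp_mean[p] - grand) ** 2 for p in suppliers)
    # Variation among the cell means, then remove both main effects.
    ss_cells = n * sum((mean(v) - grand) ** 2 for v in data.values())
    ss_inter = ss_cells - ss_shift - ss_supp
    ss_error = sum((x - mean(v)) ** 2 for v in data.values() for x in v)
    df_shift, df_supp = len(shifts) - 1, len(suppliers) - 1
    mse = ss_error / (len(shifts) * len(suppliers) * (n - 1))
    return {
        "MSE": mse,
        "F_supplier": (ss_supp / df_supp) / mse,
        "F_shift": (ss_shift / df_shift) / mse,
        "F_interaction": (ss_inter / (df_shift * df_supp)) / mse,
    }


result = two_way_anova(cells)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

The interaction sum of squares is obtained by subtraction, which is valid only because the design is balanced.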
(STATCRUNCH)

A × B interaction hypotheses:

$H_0$: No interaction between factor Supplier and factor Shift
$H_A$: Interaction exists between factor Supplier and factor Shift

$$F_{A \times B} = \frac{MS_{A \times B}}{MS_{\text{within treatments}}} = \frac{31.79}{16.2} = 1.962$$

The F critical value at df = (4, 36) and a 0.05 level of significance is 2.64. Since the F-ratio 1.962 < 2.64, do not reject the null hypothesis.

P-value approach: 0.1212 > 0.05, so do not reject $H_0$ at the 0.05 level of significance. There is insufficient evidence of an interaction between Supplier and Shift.

(c) Test at the 5% level of significance the difference in average quality of production among the three suppliers using the critical value approach.

(STATCRUNCH) Critical value approach

$H_0: \mu_{A1} = \mu_{A2} = \mu_{A3}$
$H_A$: At least two of the population means for factor Supplier are different

The F critical value at df = (2, 36) and a 0.05 level of significance is 3.27. Since F = 0.0589 < 3.27, do not reject the null hypothesis.

(d) Test at the 5% level of significance the difference in average quality of production among the three shifts using the critical value approach.

(STATCRUNCH) Critical value approach

$H_0: \mu_{B1} = \mu_{B2} = \mu_{B3}$
$H_A$: At least two of the population means for factor Shift are different

The F critical value at df = (2, 36) and a 0.05 level of significance is 3.27. Since F = 4.66 > 3.27, reject the null hypothesis.

(e) Plot the residuals against the fitted values. What key model assumptions can be examined and do these appear to be warranted?

(STATCRUNCH)

Normality: No points fall outside ± three standard deviations (the largest residual in absolute value is 9.2, against 3√MSE ≈ 12.1), so there are no outliers and the normality assumption appears valid.

Equal variance: The residuals appear evenly scattered across the fitted values rather than clustered, so the equal variance assumption appears to be met.

(f) Calculate the 95% Bonferroni margin of error for the confidence intervals based on all the pairwise differences between the average quality of production. Now, test the difference between the mean quality of production for the following two treatments.

(Manual Calculation)

Level of significance = 1 − 0.95 = 0.05

$$J = \binom{ab}{2} = \binom{9}{2} = 36$$

(i) (Supplier B, Night) versus (Supplier B, Swing).
$$\bar{y}_i - \bar{y}_j \pm t_{\alpha/(2J)} \sqrt{MSE\left(\frac{1}{n_i} + \frac{1}{n_j}\right)} = (84 - 78.6) \pm 3.71\sqrt{16.2\left(\frac{1}{5} + \frac{1}{5}\right)} = 5.4 \pm 9.44 = (-4.04, 14.84)$$

Since the confidence interval contains 0, the difference is not significant.

(ii) (Supplier A, Night) versus (Supplier C, Night).

$$\bar{y}_i - \bar{y}_j \pm t_{\alpha/(2J)} \sqrt{MSE\left(\frac{1}{n_i} + \frac{1}{n_j}\right)} = (85.4 - 79.8) \pm 3.71\sqrt{16.2\left(\frac{1}{5} + \frac{1}{5}\right)} = 5.6 \pm 9.44 = (-3.84, 15.04)$$

Since the confidence interval contains 0, the difference is not significant.

Now, draw two interaction plots (Plot 1: Supplier (x axis) vs Shift (y axis); Plot 2: Shift (x axis) vs Supplier (y axis)) using STATCRUNCH and verify whether you notice the same as you concluded in the above two comparisons.

(STATCRUNCH)

Based on the two interaction plots, neither difference appears to be significant, which is the same conclusion that was drawn in part (f).
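The Bonferroni intervals in part (f) are simple enough to recompute by hand, and a few lines of Python make the arithmetic easy to audit. The sketch below is illustrative only: `bonferroni_interval` is a hypothetical helper, the MSE and cell means are the ones quoted in the text, and the t value of 3.71 is taken as given from the manual calculation rather than recomputed.

```python
from math import sqrt

MSE = 16.2       # within-treatment mean square quoted in the text
T_CRIT = 3.71    # Bonferroni t value quoted in the text; assumed, not recomputed
N_PER_CELL = 5   # replications per treatment


def bonferroni_interval(mean_i, mean_j, mse, n_i, n_j, t_crit):
    """Return (difference, margin, lower, upper) for one pairwise comparison."""
    diff = mean_i - mean_j
    margin = t_crit * sqrt(mse * (1 / n_i + 1 / n_j))
    return diff, margin, diff - margin, diff + margin


# Cell means as used in comparisons (i) and (ii).
comparisons = {
    "(B, Night) vs (B, Swing)": (84.0, 78.6),
    "(A, Night) vs (C, Night)": (85.4, 79.8),
}

for label, (m_i, m_j) in comparisons.items():
    diff, margin, lower, upper = bonferroni_interval(
        m_i, m_j, MSE, N_PER_CELL, N_PER_CELL, T_CRIT
    )
    # An interval that contains 0 means the pair is not significantly different.
    verdict = "not significant" if lower <= 0 <= upper else "significant"
    print(f"{label}: {diff:.1f} ± {margin:.2f} = ({lower:.2f}, {upper:.2f}), {verdict}")
```

Because the design is balanced, the margin of error is the same for every pair of treatments, so only the difference in cell means changes from one comparison to the next.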
