Uploaded by oliver.lu42001

ADM2304 Assignment 3 F23 (2) (2)

Assignment 3 (Week 7 - 8)
ADM 2304 - Applications of Statistical Methods in Business
Telfer School of Management, University of Ottawa
Due Date and Time: Wednesday, 22 November 2023, 11:59 pm
Q1: Weight Loss
A pharmaceutical company would like to test the efficacy of 4 new weight loss medications.
They selected a random sample of 24 patients, placed them on diet, and assigned each weight
loss medication to a group of 6 patients for a three-month period. The following table presents
the weight loss (in lbs) by 4 types of medications.
Drug 1   Drug 2   Drug 3   Drug 4
   5        8       10       13
  12        7       11       13
   9       10        9       11
   7        5       14        8
  10       10       16       10
   7        8       12       11
The pharmaceutical company would like to determine if there is a difference in average weight
loss achieved among the 4 different medications.
(a) Identify the experimental design.
The experimental design in this case is a completely randomized design with one factor (medication) at 4 levels: the patients in the sample are randomly
selected and randomly assigned to one of Drug 1 to Drug 4, with 6 patients receiving each weight loss medication.
Moreover, this is an experimental study rather than an observational one, since the treatments are assigned in order to examine a cause-and-effect relationship:
the study looks at the average weight loss achieved under each of the 4 different medications.
(b) Graph the data using side-by-side box plots. Are the assumptions of ANOVA met here?
Explain.
Assumptions of ANOVA:
The populations are normally distributed, the population variances are equal, and the observations are
independent, i.e., the weight loss observed for one patient does not affect the weight loss observed for any other patient.
The data are ratio level, and the samples were randomly selected and randomly assigned as specified.
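The side-by-side box plots themselves are not reproduced in the text, so the following is only a minimal sketch of the numbers such plots would be drawn from. It assumes the "inclusive" quartile rule of Python's `statistics.quantiles`; StatCrunch may use a different quartile convention, so Q1 and Q3 can differ slightly. The "largest SD / smallest SD below 2" check is a common rule of thumb, not something stated in the assignment.

```python
import statistics

# Weight loss (lbs) per medication, read from the assignment's data table.
weight_loss = {
    "Drug 1": [5, 12, 9, 7, 10, 7],
    "Drug 2": [8, 7, 10, 5, 10, 8],
    "Drug 3": [10, 11, 9, 14, 16, 12],
    "Drug 4": [13, 13, 11, 8, 10, 11],
}


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the 'inclusive' quartile rule."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (min(values), q1, median, q3, max(values))


summaries = {}
std_devs = {}
for drug, values in weight_loss.items():
    summaries[drug] = five_number_summary(values)
    std_devs[drug] = statistics.stdev(values)  # sample standard deviation
    print(drug, summaries[drug], "s =", round(std_devs[drug], 2))

# Rule-of-thumb check for equal variances (an assumption of this sketch).
sd_ratio = max(std_devs.values()) / min(std_devs.values())
print("largest SD / smallest SD =", round(sd_ratio, 2))
```

Boxes of similar height and an SD ratio well under 2 would be consistent with the equal-variance assumption; with only 6 observations per group, normality can be judged only roughly.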
(c) Perform an appropriate ANOVA test to determine whether there is a difference in
average weight loss among the 4 different medications using both the critical value and p-value approaches. Use α = 0.05 and describe all necessary steps of the test of hypothesis (i.e., hypotheses, test statistic, critical value/p-value, decision with justification).
(STATCRUNCH)
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$
$H_A$: Not all means are equal
Test statistic: $F\text{-ratio} = \dfrac{MS_{Treatment}}{MSE} = \dfrac{23.3333}{5.0667} = 4.605$
F critical value at $df = (3, 20)$ and a level of significance of 0.05: $F_{0.05,\,3,\,20} = 3.10$
Since 4.605 > 3.10, the null hypothesis is rejected, which means that
there is a significant difference in the average weight loss achieved among the 4 different
medications.
P-Value Approach:
p-value = 0.0132 < 0.05, therefore we reject the null hypothesis.
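The F-ratio reported from STATCRUNCH can be reproduced directly from the data. The sketch below is a plain one-way ANOVA computation written for illustration (the function name `one_way_anova` is hypothetical); it does not compute the critical value or the p-value, which still come from an F table or from software.

```python
# Weight loss (lbs) for Drug 1 to Drug 4, read from the assignment's data table.
groups = [
    [5, 12, 9, 7, 10, 7],
    [8, 7, 10, 5, 10, 8],
    [10, 11, 9, 14, 16, 12],
    [13, 13, 11, 8, 10, 11],
]


def one_way_anova(samples):
    """Return (MS_treatment, MS_error, F) for a one-way layout."""
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    grand_mean = sum(sum(s) for s in samples) / n_total
    ss_treatment = 0.0
    ss_error = 0.0
    for s in samples:
        group_mean = sum(s) / len(s)
        # Between-group variation, weighted by the group size.
        ss_treatment += len(s) * (group_mean - grand_mean) ** 2
        # Within-group variation around the group's own mean.
        ss_error += sum((x - group_mean) ** 2 for x in s)
    ms_treatment = ss_treatment / (k - 1)
    ms_error = ss_error / (n_total - k)
    return ms_treatment, ms_error, ms_treatment / ms_error


ms_treatment, ms_error, f_ratio = one_way_anova(groups)
print(round(ms_treatment, 4), round(ms_error, 4), round(f_ratio, 3))
```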
(d) Calculate the sample standard deviation for each sample and the pooled variance.
Compare your pooled variance with the ANOVA output on STATCRUNCH. Do your
calculations agree with the STATCRUNCH computations? (Manual Calculation
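and STATCRUNCH)
The sample standard deviations are $S_1 = 2.50$, $S_2 = 1.90$, $S_3 = 2.61$ and $S_4 = 1.90$.
$$SSE = \sum_{i=1}^{4}(n_i - 1)S_i^2 = (n_1 - 1)S_1^2 + (n_2 - 1)S_2^2 + (n_3 - 1)S_3^2 + (n_4 - 1)S_4^2$$
$$SSE = (6 - 1)(2.50)^2 + (6 - 1)(1.90)^2 + (6 - 1)(2.61)^2 + (6 - 1)(1.90)^2 = 101.41$$
$$MSE = \frac{SSE}{N - K} = \frac{101.41}{24 - 4} = 5.07$$
Based on the calculations, the pooled variance (MSE ≈ 5.07) agrees with the STATCRUNCH value of 5.0667; the small difference comes from rounding the standard deviations to two decimals.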
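As a cross-check on the manual calculation, the pooled variance can be computed from unrounded sample standard deviations. This is a small sketch using Python's `statistics.stdev`; the variable names are illustrative only.

```python
import statistics

# Weight loss (lbs) for Drug 1 to Drug 4, read from the assignment's data table.
groups = [
    [5, 12, 9, 7, 10, 7],
    [8, 7, 10, 5, 10, 8],
    [10, 11, 9, 14, 16, 12],
    [13, 13, 11, 8, 10, 11],
]

# Sample standard deviation of each group (divisor n - 1).
std_devs = [statistics.stdev(g) for g in groups]

# SSE = sum of (n_i - 1) * S_i^2, pooled over the K groups.
sse = sum((len(g) - 1) * s ** 2 for g, s in zip(groups, std_devs))
n_total = sum(len(g) for g in groups)
pooled_variance = sse / (n_total - len(groups))  # MSE = SSE / (N - K)

print([round(s, 4) for s in std_devs])
print(round(sse, 4), round(pooled_variance, 4))
```

Working with unrounded standard deviations gives the same MSE as the ANOVA table, which is why the hand calculation with two-decimal values differs only in the second decimal place.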
(e) If warranted, use Bonferroni adjusted confidence interval to determine which weight
loss medication(s) appears to be more effective and hence are significantly different.
Assume the family level of significance α = 0.05. (Manual Calculation)
A Bonferroni post hoc analysis is warranted because the F test in part (c) rejected the null
hypothesis, which means that at least one medication mean differs significantly from
the others.
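$$J = \binom{4}{2} = \frac{4!}{2!(4-2)!} = 6, \qquad t_{\alpha} = 2.9271$$
(Bonferroni-adjusted critical value for $J = 6$ comparisons, with $df = N - K = 20$.)
$$S_{\bar{y}_i - \bar{y}_j} = \sqrt{MSE\left(\frac{1}{n_i} + \frac{1}{n_j}\right)} = \sqrt{5.0667\left(\frac{1}{6} + \frac{1}{6}\right)} = 1.2996$$
$$\bar{y}_1 - \bar{y}_2 \pm t_{\alpha}\sqrt{MSE\left(\frac{1}{n_i} + \frac{1}{n_j}\right)} = 8.3333 - 8 \pm 2.9271 \times 1.2996 = (-3.47,\ 4.14)$$
$$\bar{y}_1 - \bar{y}_3 \pm t_{\alpha}\sqrt{MSE\left(\frac{1}{n_i} + \frac{1}{n_j}\right)} = 8.3333 - 12 \pm 2.9271 \times 1.2996 = (-7.47,\ 0.14)$$
$$\bar{y}_1 - \bar{y}_4 \pm t_{\alpha}\sqrt{MSE\left(\frac{1}{n_i} + \frac{1}{n_j}\right)} = 8.3333 - 11 \pm 2.9271 \times 1.2996 = (-6.47,\ 1.14)$$
$$\bar{y}_2 - \bar{y}_3 \pm t_{\alpha}\sqrt{MSE\left(\frac{1}{n_i} + \frac{1}{n_j}\right)} = 8 - 12 \pm 2.9271 \times 1.2996 = (-7.80,\ -0.20)$$
$$\bar{y}_2 - \bar{y}_4 \pm t_{\alpha}\sqrt{MSE\left(\frac{1}{n_i} + \frac{1}{n_j}\right)} = 8 - 11 \pm 2.9271 \times 1.2996 = (-6.80,\ 0.80)$$
$$\bar{y}_3 - \bar{y}_4 \pm t_{\alpha}\sqrt{MSE\left(\frac{1}{n_i} + \frac{1}{n_j}\right)} = 12 - 11 \pm 2.9271 \times 1.2996 = (-2.80,\ 4.80)$$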
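Therefore, there is a significant difference in average weight loss between Drug 2 and Drug 3 at the 5% family level
of significance, because the interval for $\bar{y}_2 - \bar{y}_3$ excludes 0; Drug 3 appears more effective than Drug 2. No other pair differs significantly.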
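The six pairwise intervals can be generated in one loop. The sketch below takes the critical value 2.9271 and the MSE of 5.0667 from the solution as given inputs rather than recomputing them, and flags a pair as significant when its interval excludes 0.

```python
from itertools import combinations
from math import sqrt

# Sample means per medication (Drug 1 mean is 50/6 = 8.3333).
means = {"Drug 1": 50 / 6, "Drug 2": 8.0, "Drug 3": 12.0, "Drug 4": 11.0}
n_per_group = 6
mse = 5.0667      # pooled variance from the ANOVA table
t_crit = 2.9271   # Bonferroni critical value quoted in the solution (df = 20)

# Same margin of error for every pair because the design is balanced.
margin = t_crit * sqrt(mse * (1 / n_per_group + 1 / n_per_group))

intervals = {}
for first, second in combinations(means, 2):
    diff = means[first] - means[second]
    intervals[(first, second)] = (diff - margin, diff + margin)

# A pair differs significantly when 0 lies outside its interval.
significant = [pair for pair, (low, high) in intervals.items() if low > 0 or high < 0]

for pair, (low, high) in intervals.items():
    print(pair, round(low, 2), round(high, 2))
print("significant pairs:", significant)
```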
(f) Notwithstanding your answer to part (b), perform the Kruskal Wallis nonparametric test to
determine whether there is a difference among the median weight loss among the 4 medications using the critical value approach. Be sure to state your hypotheses, test statistic,
critical value, rank calculation in EXCEL, and decision. (Manual Calculation and
EXCEL)
$H_0$: The 4 populations are identical
$H_A$: At least two of the populations differ in location
$$H = \frac{12}{N(N+1)}\sum_{i=1}^{k}\frac{T_i^2}{n_i} - 3(N+1)$$
where $T_i$ is the rank sum of sample $i$.
$$H = \frac{12}{24(24+1)}\left(\frac{(51.5)^2}{6} + \frac{(45.5)^2}{6} + \frac{(106)^2}{6} + \frac{(97)^2}{6}\right) - 3(24+1) = 9.558$$
$$df = k - 1 = 4 - 1 = 3, \qquad \chi^2_{\alpha,\,df=k-1} = \chi^2_{0.05,\,3} = 7.815$$
$$H = 9.558 > 7.815$$
Reject the null hypothesis at the 0.05 level of significance.
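The rank sums used above can be checked without a spreadsheet. This sketch assigns average ranks to tied values (the same idea as ranking in ascending order with ties averaged) and then evaluates the H formula without any tie correction, matching the manual calculation.

```python
# Weight loss (lbs) per medication, read from the assignment's data table.
groups = {
    "Drug 1": [5, 12, 9, 7, 10, 7],
    "Drug 2": [8, 7, 10, 5, 10, 8],
    "Drug 3": [10, 11, 9, 14, 16, 12],
    "Drug 4": [13, 13, 11, 8, 10, 11],
}


def kruskal_wallis(samples):
    """Return (rank sums per group, uncorrected H statistic)."""
    pooled = sorted(x for values in samples.values() for x in values)
    n_total = len(pooled)

    def midrank(value):
        # Average of the ranks occupied by all copies of this value.
        first = pooled.index(value) + 1
        copies = pooled.count(value)
        return first + (copies - 1) / 2

    rank_sums = {name: sum(midrank(x) for x in values) for name, values in samples.items()}
    total = sum(rank_sums[name] ** 2 / len(values) for name, values in samples.items())
    h = 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)
    return rank_sums, h


rank_sums, h_stat = kruskal_wallis(groups)
print(rank_sums, round(h_stat, 3))
```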
(g) Perform the Kruskal Wallis nonparametric test using STATCRUNCH, include your
output and give your decision using p-value approach. Do you have the same result
as manual calculation done in (f)? (STATCRUNCH)
$p\text{-value} = 0.0212 < 0.05$
Therefore, reject the null hypothesis, which means that there is sufficient evidence to support the
conclusion that the weight loss distributions differ in location (median) among the 4 different test
groups. This is the same decision as the manual calculation in (f).
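The STATCRUNCH output is not reproduced in the text, so the following is a hedged reconstruction: it assumes the software divides H by the usual tie-correction factor before computing the p-value. Under that assumption the manual H of 9.558 becomes about 9.71, and the upper-tail chi-square probability with 3 degrees of freedom comes out close to the reported 0.0212. The closed-form tail formula used here is valid only for 3 degrees of freedom.

```python
from collections import Counter
from math import erfc, exp, pi, sqrt

# All 24 weight loss observations pooled across the 4 medications.
values = [5, 12, 9, 7, 10, 7, 8, 7, 10, 5, 10, 8,
          10, 11, 9, 14, 16, 12, 13, 13, 11, 8, 10, 11]
h_uncorrected = 9.558  # manual Kruskal-Wallis statistic from part (f)


def tie_correction(data):
    """Return 1 - sum(t^3 - t) / (N^3 - N), where t runs over tie-group sizes."""
    n = len(data)
    tie_sizes = Counter(data).values()
    return 1 - sum(t ** 3 - t for t in tie_sizes) / (n ** 3 - n)


def chi2_upper_tail_3df(x):
    """Upper-tail probability of a chi-square variable with exactly 3 df."""
    return erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2)


correction = tie_correction(values)
h_corrected = h_uncorrected / correction
p_value = chi2_upper_tail_3df(h_corrected)

print(round(correction, 4), round(h_corrected, 3), round(p_value, 4))
```

Either version of H exceeds the critical value 7.815, so the tie correction does not change the decision.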
Q2: Quality of Production
A production manager suspects the quality of production is affected by both the supplier (A, B,
C) of the production material and the shift (Day, Night, Swing) during which the product was produced. For
this experiment, the manager randomly selects 5 quality scores of the production for each
combination of supplier and shift, which are given in the following table.
              Day   Night   Swing
Supplier A     78     91      75
               79     89      74
               81     78      79
               76     87      80
               75     82      81
Supplier B     80     88      85
               82     87      72
               78     89      80
               78     76      75
               81     80      81
Supplier C     82     81      79
               83     75      79
               84     89      81
               79     78      80
               78     76      82
(a) Identify the experimental design, number of factors along with levels, number of treatments, and number of replications.
This is a two-factor (3 × 3) factorial design with replication. The two factors are Supplier and Shift. The factor “Supplier” has the levels
“A”, “B”, “C”, and the factor “Shift” has the levels “Day”, “Night”, and “Swing”. The number of
treatments is 9, since 3 supplier levels multiplied by 3 shift levels = 9 treatments. For the number of replications, there are 5 quality
scores for each combination of supplier and shift, which means there are 5 replications
for each treatment.
(b) Do the data provide sufficient evidence to indicate an interaction between Supplier
and Shift? Conduct an appropriate test of hypothesis at 5% level of significance using
both critical value and p-value approach. Please provide hypotheses, test statistic,
critical value, p-value, and decision with justification. (STATCRUNCH)
A x B Interaction Hypothesis:
$H_0$: No interaction between Factor Supplier and Factor Shift
$H_A$: Interaction exists between Factor Supplier and Factor Shift
$$F_{A \times B} = \frac{MS_{A \times B}}{MS_{\text{within treatments}}} = \frac{31.79}{16.2} = 1.962$$
Critical value at $df = (4, 36)$ and the 0.05 level of significance: $F_{0.05,\,4,\,36} = 2.63$
Since $F\text{-ratio} = 1.962 < F_{cv,\alpha} = 2.63$, do not reject the null hypothesis.
P-Value approach
p-value = 0.1212 > 0.05, so do not reject $H_0$ at the 0.05 level of significance. There is insufficient evidence of an interaction between Supplier and Shift.
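The interaction F-ratio can be recomputed from the quality scores. The sketch below is an illustrative balanced two-way ANOVA calculation (the function name `interaction_test` is hypothetical); critical values and p-values are not computed here.

```python
# Quality scores indexed as quality[supplier][shift], read from the data table.
quality = {
    "A": {"Day": [78, 79, 81, 76, 75], "Night": [91, 89, 78, 87, 82], "Swing": [75, 74, 79, 80, 81]},
    "B": {"Day": [80, 82, 78, 78, 81], "Night": [88, 87, 89, 76, 80], "Swing": [85, 72, 80, 75, 81]},
    "C": {"Day": [82, 83, 84, 79, 78], "Night": [81, 75, 89, 78, 76], "Swing": [79, 79, 81, 80, 82]},
}


def interaction_test(data):
    """Return (MS_interaction, MS_error, F) for a balanced two-factor design."""
    rows = list(data)
    cols = list(next(iter(data.values())))
    reps = len(data[rows[0]][cols[0]])

    def mean(values):
        values = list(values)
        return sum(values) / len(values)

    cell = {(r, c): mean(data[r][c]) for r in rows for c in cols}
    row_mean = {r: mean(x for c in cols for x in data[r][c]) for r in rows}
    col_mean = {c: mean(x for r in rows for x in data[r][c]) for c in cols}
    grand = mean(x for r in rows for c in cols for x in data[r][c])

    # Interaction: what is left of each cell mean after removing both main effects.
    ss_ab = reps * sum((cell[r, c] - row_mean[r] - col_mean[c] + grand) ** 2
                       for r in rows for c in cols)
    ss_error = sum((x - cell[r, c]) ** 2 for r in rows for c in cols for x in data[r][c])
    ms_ab = ss_ab / ((len(rows) - 1) * (len(cols) - 1))
    ms_error = ss_error / (len(rows) * len(cols) * (reps - 1))
    return ms_ab, ms_error, ms_ab / ms_error


ms_interaction, ms_error, f_interaction = interaction_test(quality)
print(round(ms_interaction, 4), round(ms_error, 4), round(f_interaction, 3))
```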
(c) Test at 5% level of significance the difference in average quality of production among
three suppliers using critical value approach. (STATCRUNCH) CVA
$H_0: \mu_{A1} = \mu_{A2} = \mu_{A3}$
$H_A$: At least two of the population means for Factor Supplier are different
$F\text{-Stat} = 0.0589 < F_{0.05,\,2,\,36} = 3.26$, so do not reject the null hypothesis.
(d) Test at 5% level of significance the difference in average quality of production among
three shifts using critical value approach. (STATCRUNCH) CVA
$H_0: \mu_{B1} = \mu_{B2} = \mu_{B3}$
$H_A$: At least two of the population means for Factor Shift are different
$F\text{-Stat} = 4.66 > F_{0.05,\,2,\,36} = 3.26$, so reject the null hypothesis.
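The two main-effect F statistics in parts (c) and (d) can be checked from the factor-level means. This is a minimal sketch for a balanced design; it takes the within-treatments mean square of 16.2 as a given input, and the variable names are illustrative.

```python
# Quality scores grouped by supplier, then by shift (Day, Night, Swing).
quality = {
    "A": {"Day": [78, 79, 81, 76, 75], "Night": [91, 89, 78, 87, 82], "Swing": [75, 74, 79, 80, 81]},
    "B": {"Day": [80, 82, 78, 78, 81], "Night": [88, 87, 89, 76, 80], "Swing": [85, 72, 80, 75, 81]},
    "C": {"Day": [82, 83, 84, 79, 78], "Night": [81, 75, 89, 78, 76], "Swing": [79, 79, 81, 80, 82]},
}
ms_error = 16.2  # within-treatments mean square from the ANOVA table


def main_effect_f(level_scores, mse):
    """Return (level means, F) for one factor, given all scores at each level."""
    all_scores = [x for scores in level_scores.values() for x in scores]
    grand_mean = sum(all_scores) / len(all_scores)
    level_means = {level: sum(s) / len(s) for level, s in level_scores.items()}
    ss = sum(len(level_scores[level]) * (m - grand_mean) ** 2 for level, m in level_means.items())
    ms = ss / (len(level_scores) - 1)
    return level_means, ms / mse


# Pool the scores over the other factor to get each factor's levels.
by_supplier = {s: [x for scores in shifts.values() for x in scores] for s, shifts in quality.items()}
by_shift = {}
for shifts in quality.values():
    for shift, scores in shifts.items():
        by_shift.setdefault(shift, []).extend(scores)

supplier_means, f_supplier = main_effect_f(by_supplier, ms_error)
shift_means, f_shift = main_effect_f(by_shift, ms_error)
print(supplier_means, round(f_supplier, 4))
print(shift_means, round(f_shift, 2))
```

The supplier means are nearly identical, while the Night shift mean stands apart from Day and Swing, which is what drives the significant shift effect.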
(e) Plot the residuals against the fitted values. What key model assumptions can be
examined and do these appear to be warranted? (STATCRUNCH)
Normality: No points fall outside ± three standard deviations, which means there are no
outliers, so the normality assumption appears valid.
Equal Variance: The residuals are scattered around zero across the fitted values rather than
clustered or fanning out in a clear pattern, so the equal-variance assumption appears to be met.
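The plot itself is not included in the text, but the points it would show can be computed: each fitted value is a cell mean and each residual is an observation minus its cell mean. The sketch below also divides the residuals by the square root of MSE as a simple standardization (an assumption of this sketch; StatCrunch may studentize residuals differently) to check the ± three standard deviations criterion.

```python
from math import sqrt

# Quality scores for each (supplier, shift) treatment, read from the data table.
cells = {
    ("A", "Day"): [78, 79, 81, 76, 75], ("A", "Night"): [91, 89, 78, 87, 82],
    ("A", "Swing"): [75, 74, 79, 80, 81], ("B", "Day"): [80, 82, 78, 78, 81],
    ("B", "Night"): [88, 87, 89, 76, 80], ("B", "Swing"): [85, 72, 80, 75, 81],
    ("C", "Day"): [82, 83, 84, 79, 78], ("C", "Night"): [81, 75, 89, 78, 76],
    ("C", "Swing"): [79, 79, 81, 80, 82],
}


def residual_table(data):
    """Return (fitted, residual) pairs and the MSE; the fitted value is the cell mean."""
    pairs = []
    for scores in data.values():
        fitted = sum(scores) / len(scores)
        pairs.extend((fitted, y - fitted) for y in scores)
    df_error = len(pairs) - len(data)
    mse = sum(r ** 2 for _, r in pairs) / df_error
    return pairs, mse


pairs, mse = residual_table(cells)
standardized = [r / sqrt(mse) for _, r in pairs]
max_abs_standardized = max(abs(z) for z in standardized)

print("MSE =", round(mse, 2))
print("largest |standardized residual| =", round(max_abs_standardized, 2))
```

Plotting the `(fitted, residual)` pairs reproduces the residual plot; the vertical spread at each fitted value is what indicates whether the variances look comparable.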
(f) Calculate 95% Bonferroni margin of error for the confidence intervals based on all
the pairwise differences between the average quality of production. Now, test the
difference between the mean quality of production for the following two treatments
(Manual Calculation):
Level of Significance = 1 − 0.95 = 0.05
$$J = \binom{ab}{2} = C_2^{ab} = \binom{9}{2} = 36$$
(i) (Supplier B, Night) versus (Supplier B, Swing).
$$\bar{y}_i - \bar{y}_j \pm t_{\alpha}\sqrt{MSE\left(\frac{1}{n_i} + \frac{1}{n_j}\right)} = (84 - 78.6) \pm 3.71\sqrt{16.2\left(\frac{1}{5} + \frac{1}{5}\right)}$$
$$= 5.4 \pm 9.44 = (-4.04,\ 14.84)$$
Since the confidence interval contains 0, the difference is not significant.
(ii) (Supplier A, Night) vs (Supplier C, Night).
$$\bar{y}_i - \bar{y}_j \pm t_{\alpha}\sqrt{MSE\left(\frac{1}{n_i} + \frac{1}{n_j}\right)} = (85.4 - 79.8) \pm 3.71\sqrt{16.2\left(\frac{1}{5} + \frac{1}{5}\right)}$$
$$= 5.6 \pm 9.44 = (-3.84,\ 15.04)$$
Since the confidence interval contains 0, the difference is not significant.
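The margin of error and both intervals can be verified with a few lines. In this sketch the critical value 3.71 is taken from the solution as given and is not recomputed; the cell means are the averages of the five scores in each named treatment.

```python
from math import comb, sqrt

num_treatments = 3 * 3                      # 3 suppliers x 3 shifts
num_comparisons = comb(num_treatments, 2)   # J = C(9, 2)
mse = 16.2                                  # within-treatments mean square
reps = 5                                    # scores per treatment
t_crit = 3.71                               # critical value quoted in the solution

# Common margin of error, since every treatment has 5 replications.
margin = t_crit * sqrt(mse * (1 / reps + 1 / reps))

cell_means = {
    ("B", "Night"): 84.0,
    ("B", "Swing"): 78.6,
    ("A", "Night"): 85.4,
    ("C", "Night"): 79.8,
}


def bonferroni_interval(mean_first, mean_second, margin_of_error):
    """Return the (lower, upper) interval for the difference of two means."""
    diff = mean_first - mean_second
    return diff - margin_of_error, diff + margin_of_error


interval_i = bonferroni_interval(cell_means["B", "Night"], cell_means["B", "Swing"], margin)
interval_ii = bonferroni_interval(cell_means["A", "Night"], cell_means["C", "Night"], margin)

print(num_comparisons, round(margin, 2))
print([round(v, 2) for v in interval_i], [round(v, 2) for v in interval_ii])
```

Both intervals straddle 0, so neither pair of treatments differs significantly at the 5% family level.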
Now, draw two interaction plots (Plot 1: Supplier (x axis) vs Shift (y axis); Plot 2:
Shift (x axis) vs Supplier (y axis)) using STATCRUNCH and verify whether you
notice the same as you concluded in the above two comparisons (STATCRUNCH).
The two interaction plots are consistent with the two comparisons in part (f): both Bonferroni
intervals contain 0, so neither difference is significant.