Part 1

Simple Comparative Experiments – Introduction to ANOVA
Read Sections 2.1, 3.1 – 3.3 in the text
Note: These notes were modified from lecture notes created by Tisha Hooks and Christopher Malone.
Example: Let’s consider an experiment that was conducted to study the bond strength of a Portland cement mortar.
The data points given below were sampled from the Tension Bond Strength experiment discussed on pages 25 – 26 in
your text. Enter these points in Excel as shown below:
Also, I’ve included the data points on the number line below.
Questions:
1. Does the modified formula seem to improve tension bond strength? Discuss.
2. What is the factor under consideration? What is the response under consideration?
3. Compute the average for each group and sketch the averages on the plot above.
Average for modified: ________________
Average for unmodified: ________________
4. Does there appear to be a considerable difference in tension bond strength between the two groups in this
experiment? Discuss.
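The group averages asked for in Question 3 can also be checked with a few lines of Python. This is only a sketch: the numbers below are hypothetical stand-ins, not the data from the text, so substitute the points you entered in your worksheet.

```python
from statistics import mean

# Hypothetical tension bond strengths (NOT the textbook data);
# replace these with the points from your worksheet.
strength = {
    "modified": [16.5, 16.9],
    "unmodified": [17.1, 17.5],
}

# Average response within each level of the factor (type of mortar).
group_means = {group: mean(values) for group, values in strength.items()}

for group, avg in group_means.items():
    print(f"Average for {group}: {avg:.2f}")
```

Comparing the two printed averages gives a first, informal sense of whether the groups differ.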
Measuring the effect of a factor
Ultimately, we are interested in determining whether the factor in the experiment has an effect on the response
variable. For the above example in particular, we’re interested in whether the type of cement mortar (modified or
unmodified) has any impact on the tension bond strength. To investigate whether the type of mortar has an impact
on tension bond strength, let’s first look at the scenario assuming group (type of cement mortar) has _____ effect on
tension bond strength. If that is the case, we would not expect there to be a difference between the two groups; that
is, we could ignore group altogether.
This scenario is depicted by the number line below.
Questions:
5. Calculate the average of these points and plot it on the number line above. Note, this is our best guess for
the tension bond strength under the assumption group has no effect.
Average of all observations: _________________
In statistics we use the concept of ________________ to determine which factor(s) are significant. For example, if
the amount of error is __________________ significantly when we consider the groups, then we can say the groups
are important and that the factor under consideration has some effect on the response variable. On the other hand,
if the amount of error is not reduced much by considering the factor, then the factor is said to be statistically
___important.
6. Sketch the amount of “error” for each observation on the above number line (ignoring groups).
Error observation #1: _________________________________ = __________________
Error observation #2: _________________________________ = __________________
Error observation #3: _________________________________ = __________________
Error observation #4: _________________________________ = __________________
7. What is the total amount of “error”?
8. What problems exist with computing the total amount of “error” in this manner?
9. How might we overcome this problem?
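One possible way to explore Questions 7 to 9 numerically is sketched below. The four observations are hypothetical (not the textbook data), and squaring the errors is shown as one common remedy, not necessarily the only one.

```python
# Hypothetical observations with group ignored (NOT the textbook data).
observations = [16.5, 16.9, 17.1, 17.5]

# Best guess for every observation when group is ignored: the overall average.
grand_mean = sum(observations) / len(observations)

# "Error" for each observation: observed value minus the overall average.
errors = [y - grand_mean for y in observations]

# Adding the raw errors lets positive and negative values cancel.
total_raw_error = sum(errors)

# Squaring each error before adding removes the cancellation.
total_squared_error = sum(e ** 2 for e in errors)

print(f"Sum of raw errors:     {total_raw_error:.4f}")
print(f"Sum of squared errors: {total_squared_error:.4f}")
```

Because deviations from an average always balance out, the raw total is zero (up to rounding) for any data set, which is why it cannot measure the amount of error.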
Computing the total amount of error assuming group has NO effect
10. The total amount of error assuming group has NO effect = ____________________________.
Computing the total amount of error assuming group HAS an effect
Next, let’s suppose that the factor __________ have an effect on tension bond strength. In this case, we WOULD expect
there to be differences between the two groups; that is, we should ______ ignore group!
Fill in the following information ASSUMING that the factor IS important.
11. What is the total amount of error assuming groups HAVE an effect?
12. What do we gain by considering group? That is, how much is the total amount of error reduced after we
account for the group effect?
Difference: ________________________________
Recall that if the amount of error is reduced significantly when we consider the groups, then we say the groups are
important and the factor under consideration has a significant effect. On the other hand, if the amount of error is
not reduced much by considering the factor, then the factor is said to be statistically unimportant.
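The comparison in Questions 11 and 12 can be sketched in Python as follows. The data are hypothetical (not the textbook values), and the variable names are illustrative only.

```python
from statistics import mean

# Hypothetical tension bond strengths (NOT the textbook data).
strength = {
    "modified": [16.5, 16.9],
    "unmodified": [17.1, 17.5],
}

all_obs = [y for values in strength.values() for y in values]
grand_mean = mean(all_obs)

# Total squared error when group is ignored (one overall average).
error_no_effect = sum((y - grand_mean) ** 2 for y in all_obs)

# Total squared error when each group gets its own average.
error_with_effect = 0.0
for values in strength.values():
    group_mean = mean(values)
    error_with_effect += sum((y - group_mean) ** 2 for y in values)

# How much error is removed by accounting for group.
reduction = error_no_effect - error_with_effect

print(f"Error assuming no group effect: {error_no_effect:.4f}")
print(f"Error assuming a group effect:  {error_with_effect:.4f}")
print(f"Reduction in error:             {reduction:.4f}")
```

The reduction on its own does not say whether the factor matters; it still has to be judged against the error that remains.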
13. Consider the difference found in Question 12 above. Is this large enough to say that group (type of cement
mortar) has a statistically significant effect?
To answer this question, we will use the statistical procedure known as Analysis of Variance (ANOVA).
We’ll now take a look at the ANOVA procedure.
Analysis of Variance (ANOVA)
The analysis of variance procedure is derived from partitioning the total variability into two components. Suppose
that _____ is the jth observation taken under factor level i. In general, we have _____ factor levels and _____
observations under the ith treatment.
Label the observations from the cement mortar example using this notation.
Partitioning the Sums of Squares
Though we haven’t used the terminology or seen the formulas given below, we have already calculated each of these
____________________________ for the cement mortar example.




Total Sum of Squares
o SST =
o This was calculated based on the assumption groups had _____ effect.
Error Sum of Squares
o SSE =
o This was calculated based on the assumption groups ______ an effect.
Treatment Sum of Squares
o SSTrt =
o This represents the _________________ in our measure of experimental error after accounting for groups.
Note that ________________________________. That is, we have partitioned the _______________ Sum of
Squares into two parts: the variation _____________ factor level means (between treatments) and the
variation ____________________ treatments, i.e. due to experimental error.
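For reference, one common way to write this partition is given below. The symbols are assumptions borrowed from standard one-way ANOVA notation and may differ from your text: $y_{ij}$ is the $j$th observation at level $i$, $a$ is the number of factor levels, $n_i$ is the number of observations at level $i$, $\bar{y}_{i\cdot}$ is the average at level $i$, and $\bar{y}_{\cdot\cdot}$ is the average of all observations.

```latex
\begin{align*}
\mathrm{SST}   &= \sum_{i=1}^{a} \sum_{j=1}^{n_i} \left( y_{ij} - \bar{y}_{\cdot\cdot} \right)^2 \\
\mathrm{SSE}   &= \sum_{i=1}^{a} \sum_{j=1}^{n_i} \left( y_{ij} - \bar{y}_{i\cdot} \right)^2 \\
\mathrm{SSTrt} &= \sum_{i=1}^{a} n_i \left( \bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot} \right)^2 \\
\mathrm{SST}   &= \mathrm{SSTrt} + \mathrm{SSE}
\end{align*}
```

The last line holds because the cross-product term vanishes: within each level, the deviations $y_{ij} - \bar{y}_{i\cdot}$ sum to zero.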
Degrees of Freedom
The degrees of freedom (df) associated with each sum of squares can be regarded as the number of ______________
elements in the sum of squares. The df for each sum of squares is given below.

Total Sum of Squares
o df =
Error Sum of Squares
o df =
Treatment Sum of Squares
o df =
Mean Squares
These are obtained by dividing the sum of squares by its associated degrees of freedom. Therefore, we get:

Treatment Mean Square
o MSTrt =
Mean Square Error
o MSE =
F-Statistic
F=
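The chain from sums of squares to the F-statistic can be sketched as a small Python function. This is a minimal illustration under stated assumptions: `one_way_anova` is a hypothetical helper (not part of any library), the data are made up, and the degrees of freedom use the usual one-way layout (total N - 1, error N - a, treatment a - 1, where N is the number of observations and a is the number of factor levels).

```python
from statistics import mean

def one_way_anova(groups):
    """Return one-way ANOVA quantities for a dict of {level: observations}."""
    all_obs = [y for values in groups.values() for y in values]
    n_total = len(all_obs)       # N: total number of observations
    n_levels = len(groups)       # a: number of factor levels
    grand_mean = mean(all_obs)

    # Sums of squares: total, error (within groups), treatment (between groups).
    sst = sum((y - grand_mean) ** 2 for y in all_obs)
    sse = 0.0
    for values in groups.values():
        group_mean = mean(values)
        sse += sum((y - group_mean) ** 2 for y in values)
    ss_trt = sst - sse

    # Degrees of freedom for each sum of squares.
    df_total = n_total - 1
    df_error = n_total - n_levels
    df_trt = n_levels - 1

    # Mean squares: each sum of squares divided by its degrees of freedom.
    ms_trt = ss_trt / df_trt
    mse = sse / df_error

    return {
        "SST": sst, "SSE": sse, "SSTrt": ss_trt,
        "df_total": df_total, "df_error": df_error, "df_trt": df_trt,
        "MSTrt": ms_trt, "MSE": mse,
        "F": ms_trt / mse,
    }

# Hypothetical tension bond strengths (NOT the textbook data).
result = one_way_anova({
    "modified": [16.5, 16.9],
    "unmodified": [17.1, 17.5],
})
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

The function assumes at least two levels and more observations than levels, so that neither mean square divides by zero.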
Questions:
14. What does it mean if the F-statistic is large?
15. What does it mean if the F-statistic is small, say close to 1?
p-value
Recall, if the p-value is less than some predetermined error rate, usually ______, then the data is said to support the
alternative hypothesis (i.e. there is statistically significant evidence for the research question).
H0:
Ha:
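For an experiment with exactly two groups, the p-value can be approximated without statistical software. The sketch below relies on the fact that an F-statistic with 1 numerator degree of freedom is the square of a t-statistic, so the p-value is the two-sided tail area of a t distribution. The function names are hypothetical, the integration uses Simpson's rule as a simple stand-in for a library routine, and the inputs are illustrative values rather than results from the text.

```python
import math

def t_density(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def p_value_two_groups(f_stat, df_error, steps=10_000):
    """Approximate P(F > f_stat) for an F statistic with (1, df_error) df.

    Uses F = t^2, so the p-value equals P(|T| > sqrt(f_stat)).
    `steps` must be even for Simpson's rule.
    """
    upper = math.sqrt(f_stat)
    h = upper / steps
    total = t_density(0.0, df_error) + t_density(upper, df_error)
    for k in range(1, steps):
        weight = 4 if k % 2 else 2
        total += weight * t_density(k * h, df_error)
    central_area = 2 * (h / 3) * total   # P(|T| <= sqrt(f_stat))
    return 1 - central_area

# Illustrative inputs only: F = 4.5 with 2 error degrees of freedom.
p_value = p_value_two_groups(4.5, 2)
print(f"p-value: {p_value:.4f}")
```

The resulting p-value is then compared with the predetermined error rate. With more than two groups this shortcut does not apply, and the F distribution itself is needed.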
Carrying out the ANOVA in Minitab
Enter the data into a new worksheet as follows.
If you want to construct a dotplot of the data, select Graph > Dotplot… To make a dotplot with groups, choose the
following, and then enter the information as shown.
Click OK and you should get the following dotplot.
To obtain the ANOVA, choose Stat > ANOVA > General Linear Model. Next, enter the information as shown
below.
After clicking OK, you should get the following output.
Questions:
16. Find the sums of squares in the output.
a. SST =
b. SSE =
c. SSTrt =
17. Find the degrees of freedom in the output.
a. df Total =
b. df Error =
c. df Treatment =
18. Find the mean squares in the output.
a. MSE =
b. MSTrt =
19. Find the F-statistic in the output.
20. Find the p-value in the output.
21. What do we conclude from this study?
We’ve just been using a sample of the data collected from this study. Let’s take a look at the analysis using the
complete data set. The data can be found on the course website in the file cement_mortar.mpj.
Questions:
22. Create a dotplot of the data by group. Looking at the dotplot, does there appear to be a difference in tension
bond strength based upon the type of cement mortar used? Explain your reasoning.
23. Carry out the ANOVA to determine if there is an effect of cement mortar on tension bond strength.