ISE362 Chapter Ten

Chapter Ten
The Analysis Of
Variance
ANOVA Definitions
> Factor
The characteristic that
differentiates the
treatments or populations
from one another.
> Level (Treatments)
The different treatments or
populations being compared;
I denotes how many there are.
Randomized Experiment
Randomizing the order of
sample observations will
balance out any known or
unknown nuisance variable
that may influence the
observed response.
Mean Square for Treatments
$$MSTr = \frac{J}{I-1}\left[(\bar{X}_1 - \bar{X})^2 + \dots + (\bar{X}_I - \bar{X})^2\right]$$
For I = number of levels
For J = number of observations in each sample
Mean Square for Error
$$MSE = \frac{S_1^2 + S_2^2 + \dots + S_I^2}{I}$$
Test Statistic for Single
Factor ANOVA
$$F = \frac{MSTr}{MSE}$$
With ν1 = I − 1 & ν2 = I(J − 1)
ANOVA on Single-Factor Experiment
Several (I) Means
All Normal (Same σ²)
Null Hypothesis: H0: μ1 = μ2 = … = μI
Test Statistic: f = MSTr / MSE
Alternative Hypothesis:
Ha: (at least two means are not equal)
Reject Region (upper tailed)
f ≥ F(α, I − 1, I(J − 1))
Experiment: (IJ − 1) total DOF
ANOVA on Single-Factor Experiment
Experiments were conducted to study whether commercial
processing of various foods changes the concentration of
essential elements for human consumption. One such
experiment was to study the concentration of zinc in green
beans. A batch of green beans was divided into 4 groups.
The 4 groups were then randomly assigned to be measured
(10) times each for zinc as follows: group 1 measured Raw;
group 2 measured before Blanching; group 3 measured
after Blanching; and group 4 measured after the final
processing step. Ten independent measurements were
taken from the 4 groups (treatments), yielding the following
data:
Zinc Concentration
Group:  1     2     3     4
Mean:   2.01  2.58  2.10  3.05
s:      0.25  0.50  0.30  1.00
Test at the 5% level of significance whether the mean zinc
concentration is the same for all 4 groups.
Measurements of this type are known to be Normal.
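The arithmetic for this example can be sketched in a few lines of Python. This is only an illustration using the MSTr, MSE and F formulas from the earlier slides; the function name `anova_f_from_summary` is made up for this sketch, and the critical value still has to be read from an F table.

```python
from statistics import mean

def anova_f_from_summary(means, sds, j):
    """F statistic for single-factor ANOVA from group means and standard
    deviations, assuming every group has the same sample size j."""
    i = len(means)
    grand = mean(means)  # with equal J the grand mean is the plain average
    mstr = j * sum((m - grand) ** 2 for m in means) / (i - 1)
    mse = sum(s ** 2 for s in sds) / i  # average of the sample variances
    return mstr, mse, mstr / mse, (i - 1, i * (j - 1))

# Green bean summary data from the slide: 4 groups, 10 measurements each
mstr, mse, f, dof = anova_f_from_summary(
    [2.01, 2.58, 2.10, 3.05], [0.25, 0.50, 0.30, 1.00], j=10)
print(f"MSTr = {mstr:.3f}, MSE = {mse:.4f}, f = {f:.2f}, DOF = {dof}")
```

The computed f is then compared with the tabled F(.05, 3, 36) to decide whether H0 is rejected.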
ANOVA on Single-Factor Experiment Example
The coded values for the measure of elasticity
(N/m²) in plastic, prepared by two different
processes A & B, for samples of (6) drawn
randomly from each of the two processes are as
follows:
Group:  A     B
Mean:   7.28  8.02
s²:     0.48  0.71
Do the data present sufficient evidence to
indicate a difference in mean elasticity for the two
processes at a level of significance of α = .05?
Measurements of this type are found to follow a
Normal pdf.
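With only two groups, the single-factor F test gives the same decision as the pooled two-sample t test, because f equals t squared. The sketch below checks that numerically for this example; the variable names are illustrative only.

```python
import math

means = [7.28, 8.02]       # sample means for processes A and B
variances = [0.48, 0.71]   # sample variances s^2 for A and B
J = 6                      # observations per process
I = len(means)

grand = sum(means) / I
mstr = J * sum((m - grand) ** 2 for m in means) / (I - 1)
mse = sum(variances) / I   # pooled variance when sample sizes are equal
f_stat = mstr / mse

# Pooled two-sample t statistic built from the same MSE
t_stat = (means[1] - means[0]) / math.sqrt(mse * 2 / J)

print(f"f = {f_stat:.3f} with DOF ({I - 1}, {I * (J - 1)})")
print(f"t = {t_stat:.3f}, t^2 = {t_stat ** 2:.3f}")
```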
ANOVA on Single-Factor Experiment Several
(I) Variances (Equal Samples J)
All Normal
Null Hypothesis: H0: σ1² = σ2² = … = σI²
Test Statistic: χ² = (2.3026) Q / h
“Bartlett’s Test”
Alternative Hypothesis:
Ha: (at least 2 variances are not equal)
Reject Region (upper tailed test)
χ² ≥ χ²(α, I − 1)
$$Q = I(J-1)\log_{10}(MSE) - (J-1)\left[\log_{10}(S_1^2) + \dots + \log_{10}(S_I^2)\right]$$
$$h = 1 + \frac{1}{3(I-1)}\left[\frac{I}{J-1} - \frac{1}{I(J-1)}\right]$$
ANOVA on Single-Factor Experiment Several (I)
Variances (Equal Samples J) Example
A study is designed to investigate the sulfur content
of (5) major coal seams. Eight core samples are
taken at randomly selected points within each seam.
The measured response is the S% content. Before
performing a Hypothesis Test on the data to detect
any differences that might exist in the average sulfur
content for these (5) seams, you are required to test
the condition that the (5) seams all have the same
population variance at a level of significance of .05.
The summary statistics on the sulfur content of the
(5) major coal seams follow:
Seam:  1      2      3      4      5
Mean:  1.66   1.17   1.46   0.88   1.189
s²:    .175   .144   .115   .123   .074
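A minimal sketch of Bartlett's statistic for this data, following the Q and h formulas from the previous slide with base-10 logarithms (which is what the factor 2.3026 = ln 10 implies). The function name `bartlett_equal_n` is hypothetical.

```python
import math

def bartlett_equal_n(variances, j):
    """Bartlett's chi-square statistic for I samples of equal size j."""
    i = len(variances)
    mse = sum(variances) / i
    q = (i * (j - 1) * math.log10(mse)
         - (j - 1) * sum(math.log10(v) for v in variances))
    h = 1 + (i / (j - 1) - 1 / (i * (j - 1))) / (3 * (i - 1))
    return 2.3026 * q / h, i - 1

# Sample variances of the 5 coal seams, 8 core samples per seam
chi2, dof = bartlett_equal_n([0.175, 0.144, 0.115, 0.123, 0.074], j=8)
print(f"chi-square = {chi2:.2f} with {dof} DOF")
```

The statistic is compared with the upper-tail chi-square critical value for α = .05 and 4 DOF (9.488 in standard tables); equal variances are rejected only if the statistic is at least that large.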
ANOVA on Single-Factor Experiment Several
(I) Variances (Equal Samples J) Example
Use Bartlett’s Hypothesis Test to determine
whether it is reasonable to assume
homogeneity of variances for the (4) treatment
groups in the study of whether commercial
processing of various foods changes the
concentration of essential elements for
human consumption. Use α = .05.
Rough rule of thumb: If the largest s is not
much more than two times the smallest, it is
reasonable to assume equal variances.
ANOVA Multiple Comparisons
Procedures for identifying
which μi’s significantly differ
when H0 is rejected:
> Tukey
> Bonferroni
> Duncan
> Fisher LSD
> Newman-Keuls
Tukey’s T Method (Equal Samples)
1. Select α & find Q(α, I, I(J − 1)) from
Studentized Range Distribution
Table A.10 on pg. 736. (m = I)
2. Determine $w = Q_{\alpha, I, I(J-1)} \sqrt{MSE/J}$
3. List the sample means x̄i in increasing order &
underline those pairs that differ
by less than w.
Any pair of x̄i’s not underscored
by the same line corresponds to
a pair of population or treatment
means that are judged
significantly different.
Examples of Tukey’s Method
Summary Results: w = 5.37
x̄1    x̄5    x̄2    x̄3    x̄4
9.8   10.8  15.4  17.6  21.6
Summary Results: w = 0.40
x̄5    x̄3    x̄2    x̄4    x̄1
6.1   6.3   6.8   7.3   7.5
Summary Results: w = 0.40
x̄5    x̄3    x̄2    x̄4    x̄1
6.1   6.3   7.15  7.3   7.5
ANOVA Multiple Comparison Tukey’s Method
A product development engineer is interested in
maximizing the tensile strength of a new synthetic fiber.
Previous experience indicates that the strength is affected
by the % of cotton in the fiber. The engineer suspects that
increasing the cotton content will increase the strength, at
least initially. He decides to test (5) specimens at (5) levels
of cotton content. Summary data follows:
Cotton %:    15    20    25    30    35
Mean (psi):  9.8   15.4  17.6  21.6  10.8
s:           3.35  3.13  2.07  2.61  2.86
The Null Hypothesis H0 is rejected because the F statistic
falls in the Reject Region. The % of cotton in the fiber
significantly affects the mean tensile strength. Now use
Tukey’s T method to find significant differences among the
means. Use α = .05.
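One way to carry out the Tukey steps for this data is sketched below. The studentized range value `Q_TABLE = 4.23` is an assumption of this sketch: it is the entry for α = .05, m = 5, ν = 20 in a standard studentized range table and should be checked against Table A.10.

```python
import math
from itertools import combinations

Q_TABLE = 4.23  # assumed table value Q(.05, 5, 20); verify in Table A.10

means = {"15%": 9.8, "20%": 15.4, "25%": 17.6, "30%": 21.6, "35%": 10.8}
sds = [3.35, 3.13, 2.07, 2.61, 2.86]
J = 5  # specimens per cotton level

mse = sum(s ** 2 for s in sds) / len(sds)
w = Q_TABLE * math.sqrt(mse / J)

# Pairs whose sample means differ by more than w are judged different
significant = [(a, b) for a, b in combinations(means, 2)
               if abs(means[a] - means[b]) > w]

print(f"MSE = {mse:.2f}, w = {w:.2f}")
for a, b in significant:
    print(f"{a} vs {b}: difference = {abs(means[a] - means[b]):.1f}")
```

The resulting w agrees with the w = 5.37 shown in the first Tukey summary example.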
ANOVA Multiple Comparison Tukey’s Method
An experiment is developed to measure the effect that teaching
methods have on students’ performance. The following table summarizes
the numerical grades on a standard arithmetic test given to 45
students divided randomly into (5) equal-sized groups. Groups 1 & 2
were taught by the current method. Groups 3, 4, & 5 were taught
together for a number of days; on each day group 3 students were
praised publicly for their previous work while group 4 students were
criticized publicly. Group 5 students, while hearing the praise and
criticism of groups 3 & 4, were ignored.
Group:  1      2      3      4      5
Mean:   19.67  18.33  27.44  23.44  16.11
s²:     17.72  12.75  6.05   9.55   13.104
Test the null hypothesis that there is no difference in the mean
grades produced by these teaching methods using α = .05. Then use
Tukey’s T method to compare & illustrate the difference in the
teaching methods.
Least Significant Difference Method
(Equal Samples)
1. Select α & find t(α/2, I(J − 1))
2. Determine $LSD = t_{\alpha/2, I(J-1)} \sqrt{2\,MSE/J}$
3. Compare the observed difference
between each pair of averages
to the LSD.
If |x̄i − x̄j| > LSD, we conclude
that the population means μi and
μj differ.
Example: Least Significant Difference Method
A manufacturer of paper used for making grocery bags is
interested in improving the tensile strength of the product.
Product engineering thinks that the tensile strength is a
function of the hardwood concentration in the pulp and that
the range of hardwood concentrations of practical interest is
between 5 and 20%. You decide to investigate (4) levels of
hardwood concentration. Six specimens at each of the (4)
concentration levels are prepared and tested on a tensile
tester in random order. The summary data from this
experiment are shown in the following table:
Hardwood
Concentration:  5%     10%    15%    20%
Mean (psi):     10.00  15.67  17.00  21.17
s²:             8.00   7.87   3.20   6.97
Test the null hypothesis that there is no difference
in the mean tensile strength produced by these (4)
concentration levels using α = .01. Then use the
LSD method at α = .05 to compare & illustrate the
difference at each level of concentration.
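The LSD comparison for this example might be sketched as follows. The value `T_TABLE = 2.086` is an assumption supplied here: it is the standard t-table entry for α/2 = .025 with I(J − 1) = 20 DOF.

```python
import math
from itertools import combinations

T_TABLE = 2.086  # assumed table value t(.025, 20); verify in a t table

means = {"5%": 10.00, "10%": 15.67, "15%": 17.00, "20%": 21.17}
variances = [8.00, 7.87, 3.20, 6.97]
J = 6  # specimens per concentration level

mse = sum(variances) / len(variances)
lsd = T_TABLE * math.sqrt(2 * mse / J)

# Pairs whose means differ by no more than the LSD are not judged different
not_different = [(a, b) for a, b in combinations(means, 2)
                 if abs(means[a] - means[b]) <= lsd]

print(f"MSE = {mse:.2f}, LSD = {lsd:.2f}")
print("Pairs not judged different:", not_different)
```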
Example: Least Significant Difference Method
The effective life of insulating fluids at an
accelerated load of 50 m/sec2 is being studied. Test
data have been obtained for (4) types of fluids. The
summary results for (7) trials on each fluid are as
follows:
Life (in hours) at 50 m/sec²
Fluid Type:  1      2      3      4
Mean:        18.65  17.95  20.95  18.82
s²:          3.81   3.44   3.53   2.42
Is there any indication that the fluids differ at a
significance level of .05?
Which fluid or fluids would you select if the
objective is long life? Use the Least Significant
Difference method with an alpha of .05 to support
your conclusion.
β-Error for Single Factor ANOVA F-Test
Non-centrality parameter:
$$\lambda = \frac{J \sum_i (\mu_i - \mu)^2}{\sigma^2}$$
For Non-central F distribution.
With Degrees of Freedom:
ν1 = I − 1
ν2 = I(J − 1)
β-Error for Single Factor ANOVA
1) Find the value of σ²
(Experience)
2) Find the values of (μi − μ)
3) Compute Φ² using:
(Replaces ’)
$$\Phi^2 = \frac{J}{I} \cdot \frac{\sum_i (\mu_i - \mu)^2}{\sigma^2}$$
4) Use Power Curves (pg. 422) to
look up the power value: β = 1 – Power
> Use the appropriate set of curves for ν1
> Φ (with α) is on the horizontal axis
> Move up to the curve associated
with ν2
> Find the power value on the vertical
axis
β-Error ANOVA Example
A product development engineer is interested in
maximizing the tensile strength of a new synthetic fiber.
Previous experience indicates that the strength is affected
by the % of cotton in the fiber. The engineer suspects that
increasing the cotton content will increase the strength, at
least initially. He decides to test (5) specimens at (5) levels
of cotton content. Summary data follows:
Cotton %:    15     20    25    30    35
Mean (psi):  9.8    15.4  17.6  21.6  10.8
s²:          11.22  9.80  4.28  6.81  8.18
What is the β-error if the engineer is interested in
rejecting the null hypothesis when the five treatment means
are as follows: μ15 = 11, μ20 = 12, μ25 = 15, μ30 = 18, μ35 = 19?
Historically, the standard deviation of tensile strength is
usually equal to 3 psi. Assume α = .01 for this test.
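The quantities needed to enter the power curves for this example can be computed as in the sketch below. It covers steps 1 to 3 of the β-error procedure only; the power itself still has to be read from the curves. The function name `phi_squared` is made up for this illustration.

```python
import math

def phi_squared(mus, sigma, j):
    """Phi^2 = (J / I) * sum((mu_i - mu)^2) / sigma^2 for equal sample sizes."""
    i = len(mus)
    mu_bar = sum(mus) / i
    return (j / i) * sum((m - mu_bar) ** 2 for m in mus) / sigma ** 2

mus = [11, 12, 15, 18, 19]   # treatment means of interest
J = 5                        # specimens per cotton level
phi2 = phi_squared(mus, sigma=3, j=J)
phi = math.sqrt(phi2)
nu1, nu2 = len(mus) - 1, len(mus) * (J - 1)

print(f"Phi^2 = {phi2:.3f}, Phi = {phi:.3f}, nu1 = {nu1}, nu2 = {nu2}")
```

With Φ, ν1, ν2 and α = .01 in hand, the power is read from the curves and β = 1 − Power.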
β-Error ANOVA Example
Suppose that (5) means are being compared
in a completely randomized experiment with
α = .01. The design engineer would like to
know how many samples to take if it is
important to reject the Null Hypothesis with
probability at least 0.90 if Σ (μi − μ)² = 25 &
the population variance is known to be 5.0.
β-Error ANOVA Example
Suppose that (4) Normal populations have
common variance σ² = 25 and means μ1 = 50,
μ2 = 60, μ3 = 50, and μ4 = 60. How many
observations should be taken on each
population so that the probability of
rejecting the hypothesis of equality of means
is at least 0.90? Use α = 0.05.
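Sample-size questions like this one are usually answered by trial and error: pick a candidate J, compute Φ and ν2, read the power from the curves, and increase J until the power reaches the target. The sketch below only tabulates Φ and the degrees of freedom for a few candidate J values; the range of J tried is an arbitrary choice, and reading the power curves remains a manual step.

```python
import math

mus = [50, 60, 50, 60]   # population means from the example
sigma2 = 25              # common variance
I = len(mus)

mu_bar = sum(mus) / I
ss = sum((m - mu_bar) ** 2 for m in mus)  # sum of squared deviations

rows = []
for j in range(2, 8):    # candidate observations per population
    phi = math.sqrt((j / I) * ss / sigma2)
    rows.append((j, round(phi, 2), I - 1, I * (j - 1)))

print("J   Phi   nu1  nu2")
for j, phi, nu1, nu2 in rows:
    print(f"{j:<3} {phi:<5} {nu1:<4} {nu2}")
```

For this example Φ² works out to J itself, so each row is a point to look up on the ν1 = 3, α = .05 curves.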
Single-Factor ANOVA (Unequal Sample Sizes Ji)
$$F = \frac{MSTr}{MSE}$$
With ν1 = I − 1 & ν2 = N − I
(N = J1 + … + JI, the total number of observations)
Where:
$$MSTr = \frac{SSTr}{I-1}$$
And
$$MSE = \frac{SSE}{N-I}$$
ANOVA Definitions
Sum of Squares Treatment:
$$SSTr = \sum_i J_i (\bar{x}_i - \bar{x})^2$$
Sum of Square Error:
$$SSE = \sum_i \sum_j (x_{ij} - \bar{x}_i)^2$$
Sum of Square Total:
SST = SSTr + SSE
Example of Unbalanced Design
Twenty-seven coins discovered in Cyprus were
grouped into (4) classes, corresponding to (4)
different coinages during the reign of King Manuel
I Comnenus (1143-1180). Archaeologists are
interested in whether there were significant
differences in the Ag content of coins minted early
and late in King Manuel’s reign. Test H0 at α =
.01. Summary data for testing the Ag content of early
coins (group 1) to later coins (group 4) follow:
Group  Ji  Mean
1      9   6.74%
2      7   8.24%
3      4   4.88%
4      7   5.61%
SSE = 11.02    SSTr = 37.75
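For the unbalanced design the F statistic follows directly from the summary sums of squares. A short sketch using the values given in the table (variable names are illustrative):

```python
J = [9, 7, 4, 7]           # coins in each of the 4 coinage groups
sstr, sse = 37.75, 11.02   # sums of squares from the summary data

N, I = sum(J), len(J)
mstr = sstr / (I - 1)      # mean square for treatments
mse = sse / (N - I)        # mean square for error
f_stat = mstr / mse

print(f"N = {N}, MSTr = {mstr:.3f}, MSE = {mse:.4f}")
print(f"f = {f_stat:.2f} with DOF ({I - 1}, {N - I})")
```

The result is compared with the tabled F(.01, 3, 23).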
Multiple Comparisons (Unequal Samples)
Tukey’s method modified:
1. Select α & find Q(α, I, N − I) from
Studentized Range Distribution
Table A.10 on pg. 736. (m = I)
2. Determine $w_{ij} = Q_{\alpha, I, N-I} \sqrt{\frac{MSE}{2}\left(\frac{1}{J_i} + \frac{1}{J_j}\right)}$
Uses the average of the pair 1/Ji and 1/Jj instead of 1/J.
3. List the x̄i’s in increasing order &
underline those pairs that differ
by less than wij.
Example of Multiple Comparison
(Unequal Sample Sizes)
Use Tukey’s modified T method at α = .01
to compare & illustrate the difference in the
means of Ag percentage in coins found on
Cyprus.
Group  Ji  Mean
1      9   6.74%
2      7   8.24%
3      4   4.88%
4      7   5.61%
SSE = 11.02    SSTr = 37.75
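Because wij changes from pair to pair, it can be convenient to rearrange step 2 of the modified method: divide each observed difference by the square-root factor and compare the result directly with the tabled Q(α, I, N − I). The sketch below does that for the coin data. It is an equivalent rearrangement chosen for this illustration rather than the slide's exact layout, and the Q value must still be read from Table A.10.

```python
import math
from itertools import combinations

J = {1: 9, 2: 7, 3: 4, 4: 7}                  # group sample sizes
means = {1: 6.74, 2: 8.24, 3: 4.88, 4: 5.61}  # group mean Ag content (%)
sse = 11.02
N, I = sum(J.values()), len(J)
mse = sse / (N - I)

def studentized_difference(a, b):
    """|mean_a - mean_b| divided by sqrt(MSE/2 * (1/Ja + 1/Jb))."""
    scale = math.sqrt(mse / 2 * (1 / J[a] + 1 / J[b]))
    return abs(means[a] - means[b]) / scale

ratios = {(a, b): studentized_difference(a, b)
          for a, b in combinations(J, 2)}

# A pair is judged different when its ratio exceeds Q(alpha, I, N - I)
for (a, b), r in ratios.items():
    print(f"groups {a} and {b}: studentized difference = {r:.2f}")
```

A ratio larger than the table value is the same condition as the observed difference exceeding wij.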