Analysis of Variance

Analysis of Variance Notes
Experiments versus Studies
Types of Experiments
Assumptions & Assumption Checks
Types of Analysis
1. Experiments versus Studies
1.1 Terminology
Factors versus Independent Variables
Example: hours studied and major are two factors affecting Grade
Treatments –
Example: specific combinations of hours studied and teaching method
1.2 Purpose
Observational Study –
o Correlational –
o Observe values of X
Experiment –
o Cause-Effect
o Control values of X
o Balanced
o unbalanced
2. Types of Experimental Designs
2.1 Randomized Design
one factor
two factor
2.2 Randomized Block Design
2.3 Examples
Teaching Method only
Teaching method and hours studied
Teaching method within major
3. Assumptions & Assumption Checks
3.1 Assumptions
Same Variance
3.2 Assumption checks
Modified Levene – compares absolute deviations from the group center (median)
Normality Tests and Box Plots
4. Analysis
Sources of Variability and degrees of freedom
Tests of effects of
o One factor designs
o Each factor in two factor designs
o Combination effects in two factor designs
Tests of Assumptions
Tests and estimation of differences in averages
4.1 Sources of variability and degrees of freedom
Total: Values around overall average: divisor of (n-1)
Factor: Factor averages variation: divisor of (# averages – 1)
Interaction: Combination effects: divisor of (product of factor divisors)
Error: Randomness: divisor of (n - # of averages or combination of averages)
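The divisors above can be tabulated with a short sketch. The function names `one_way_df` and `two_way_df` are hypothetical, and the two-factor version assumes every combination of levels is observed.

```python
def one_way_df(n, k):
    """Divisors (degrees of freedom) for a one-factor design.

    n = total number of observations, k = number of factor averages.
    """
    return {"total": n - 1, "factor": k - 1, "error": n - k}


def two_way_df(n, a, b):
    """Divisors for a two-factor design with a and b levels and interaction."""
    return {
        "total": n - 1,
        "factor_1": a - 1,
        "factor_2": b - 1,
        "interaction": (a - 1) * (b - 1),  # product of the factor divisors
        "error": n - a * b,                # n minus number of combinations
    }


# Example: 4 groups with 5 observations each
print(one_way_df(n=20, k=4))
# Example: 2 x 2 combinations with 4 observations each
print(two_way_df(n=16, a=2, b=2))
```

In both cases the factor, interaction, and error divisors add up to the total divisor n - 1, which is a quick check on the bookkeeping.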
4.2 Tests of effects of
4.2.1 One Factor – Completely Randomized Design or Independent Sample Study
Test Template:
Null hypothesis: average value of Y is the same for all levels of the factor
Alternative: at least two are different
Test Statistic: Compares variation of factor averages to variation of
random data
Among-Group variation to within-group variation
Rejection Region: The above ratio (F ratio) is large: F > F table
Two degrees of freedom: numerator degrees of freedom and denominator
degrees of freedom
Conclusion: We can (not) say the average value of Y differs for at least
two levels of the factor.
Example: Y = tensile strength of a product
Factor = 4 Suppliers
Obtain samples of size 5 from each supplier (n = ____ )
MSA = sample factor variability = 21.095
MSW = sample error variability = 6.094
Null hypothesis: 1=2=3=4 (average value of ______________ is the
Same for all ________________)
Alternative: at least two are different
Test Statistic: MSA/MSW =
Rejection Region: Reject Ho if F > F table with
Numerator degrees of freedom = ______ and denominator d.f. = _____
F-Table = _______
Conclusion: We can (not) say that the average _________________
differs for at least two ________________________
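As a check on a table lookup, the F ratio and its upper-tail probability can be computed directly. This is a minimal sketch using only the standard library: `f_upper_tail` is a hypothetical helper that integrates the beta density numerically (Simpson's rule), and it is only meant for a positive F ratio with denominator d.f. greater than 2. The values n = 20 and 4 groups follow from 4 suppliers with 5 samples each.

```python
import math


def f_upper_tail(f_obs, df_num, df_den, steps=20000):
    """Approximate P(F > f_obs) for an F(df_num, df_den) variable.

    Uses P(F > f) = I_x(df_den/2, df_num/2) with
    x = df_den / (df_den + df_num * f), where I_x is the regularized
    incomplete beta function, evaluated here by Simpson's rule.
    Assumes f_obs > 0 and df_den > 2.
    """
    a, b = df_den / 2.0, df_num / 2.0
    x = df_den / (df_den + df_num * f_obs)
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)

    def density(t):
        if t <= 0.0:
            return 0.0  # valid because a > 1 when df_den > 2
        return math.exp(log_norm + (a - 1) * math.log(t)
                        + (b - 1) * math.log(1 - t))

    h = x / steps  # steps must be even for Simpson's rule
    total = density(0.0) + density(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    return total * h / 3


msa, msw = 21.095, 6.094          # values from the example
n, k = 20, 4                      # 4 suppliers x 5 samples
f_ratio = msa / msw
p_value = f_upper_tail(f_ratio, df_num=k - 1, df_den=n - k)
print(round(f_ratio, 3), round(p_value, 4))
```

Rejecting when the p-value is below alpha is equivalent to rejecting when F exceeds the F-table value for the same alpha and degrees of freedom.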
4.2.2 One Factor – Randomized Block
Test Template:
Same as in the one-factor test (4.2.1), but denominator degrees of freedom =
(# of factor means - 1)*(# of block means - 1)
Example: Y = Rating of a restaurant’s service
Factor = 4 Restaurants
Block = all restaurants reviewed by same 6 raters (n = ____ )
MSA = sample factor variability = 595.8
MSE = sample error variability = 14.986
Null hypothesis: 1=2=3=4 (average value of ______________ is the
Same for all ________________)
Alternative: at least two are different
Test Statistic: MSA/MSW =
Rejection Region: Reject Ho if F > F table with
Numerator degrees of freedom = ______ and denomination d.f. = _____
F-Table = _______
Conclusion: We can (not) say that the average _________________
differs for at least two ________________________
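The randomized block version changes only the denominator degrees of freedom. A small sketch, with the hypothetical function name `randomized_block_f`, applied to the restaurant numbers:

```python
def randomized_block_f(ms_factor, ms_error, n_levels, n_blocks):
    """F ratio and degrees of freedom for a one-factor randomized block design."""
    df_num = n_levels - 1
    df_den = (n_levels - 1) * (n_blocks - 1)
    return ms_factor / ms_error, df_num, df_den


# 4 restaurants (factor), 6 raters (blocks), so n = 4 * 6 = 24 ratings
f_ratio, df_num, df_den = randomized_block_f(
    ms_factor=595.8, ms_error=14.986, n_levels=4, n_blocks=6
)
print(round(f_ratio, 2), df_num, df_den)
```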
4.2.3 Two Factors – Interaction or combination effects
Test Template:
Null hypothesis: (no interaction) difference in average value of Y between
any two levels of factor one does not depend on the level of factor two
Alternative: (interaction) difference in average value of Y between any
two levels of factor one does depend on the level of factor two
Test Statistic: Compares variation of interaction to variation of random
data
Among-Group variation to within-group variation
Rejection Region: The above ratio (F ratio) is large: F > F table
Two degrees of freedom:
numerator d.f. = product of factor 1 and 2 d.f.
denominator d.f. = n – number of combinations of factors 1 and 2
Conclusion: we can (not) say that the difference in average value of Y
between any two levels of factor one does depend on the level of factor
two.
Example: Y = length of a ball bearing's life
Factor 1 = heat treatment (high or low)
Factor 2 = ring osculation (high or low)
Obtain samples of size 2 from each combination (n = ____ )
MSAB = sample interaction variability = 3280.5
MSE = sample error variability = 61
Null hypothesis: (no interaction) difference in average value of
_______________ between any two levels of ________________ does
not depend on the level of ___________________
Alternative: (interaction) difference in average value of _______________
between any two levels of ________________ does depend on the level of
___________________
Test Statistic: Compares variation of interaction to variation of random
data: MSAB/MSE =
Rejection Region: The above ratio (F ratio) is large: F > F table
Two degrees of freedom:
numerator d.f. = product of factor 1 and 2 d.f. = ______
denominator d.f. = n – number of combinations of factors 1 and 2 = ______
Conclusion: we can (not) say that the difference in average value of
_____________________ between any two levels of
____________________ does depend on the level of _______________.
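The interaction test in this example can be sketched as follows. The function name `interaction_f` is hypothetical, and alpha = 0.05 is an assumption (the example does not state alpha); 7.71 is the standard F-table value for alpha = 0.05 with (1, 4) degrees of freedom.

```python
def interaction_f(ms_interaction, ms_error, levels_1, levels_2, n):
    """F ratio and degrees of freedom for the two-factor interaction test."""
    df_num = (levels_1 - 1) * (levels_2 - 1)   # product of factor d.f.
    df_den = n - levels_1 * levels_2           # n minus number of combinations
    return ms_interaction / ms_error, df_num, df_den


n = 2 * 2 * 2  # 2 heat treatments x 2 osculation levels x 2 bearings each
f_ratio, df_num, df_den = interaction_f(3280.5, 61, levels_1=2, levels_2=2, n=n)

F_TABLE = 7.71  # assumed alpha = 0.05, d.f. (1, 4), from a standard F table
reject_no_interaction = f_ratio > F_TABLE
print(round(f_ratio, 2), df_num, df_den, reject_no_interaction)
```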
4.2.4 One of the two factors – Completely Randomized Design or Independent
Sample Study – NO SIGNIFICANT INTERACTION
Test Template:
Same as in the one-factor test, but
denominator d.f. = n – (# of levels of factor 1)*(# of levels of factor 2)
Example: Y = rating of a photographic plate
Factor A = 2 levels of development strength,
Factor B = 2 levels of development time (10 and 14 minutes)
Randomly assign 4 plates to each of the 4 combinations
MSA = sample variability of factor A (strength) = 1.5625
MSB = sample variability of factor B (time) = 56.5625
MSE = sample error variability = 2.229
(no interaction was found – testing time effect)
Null hypothesis: 1=2 (average value of ______________ is the
Same for all ________________)
Alternative: at least two are different
Test Statistic: MSB/MSE =
Rejection Region: Reject Ho if F > F table with
Numerator degrees of freedom = ______ and denomination d.f. = _____
F-Table = _______
Conclusion: We can (not) say that the average _________________
differs for at least two ________________________
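With no significant interaction, each factor is tested against the same error term. The sketch below computes both main-effect F ratios from the MSA, MSB, and MSE values given in the example; `main_effect_f` is a hypothetical name, and n = 16 follows from 4 plates in each of the 4 combinations.

```python
def main_effect_f(ms_factor, ms_error, levels_tested, levels_other, n):
    """F ratio and degrees of freedom for one factor in a two-factor design."""
    df_num = levels_tested - 1
    df_den = n - levels_tested * levels_other
    return ms_factor / ms_error, df_num, df_den


n = 4 * 4          # 4 plates x 4 combinations
mse = 2.229
f_a, df_num_a, df_den_a = main_effect_f(1.5625, mse, 2, 2, n)    # MSA / MSE
f_b, df_num_b, df_den_b = main_effect_f(56.5625, mse, 2, 2, n)   # MSB / MSE
print(round(f_a, 3), round(f_b, 3), df_num_b, df_den_b)
```

Both ratios share the same denominator d.f. because the error term does not depend on which factor is being tested.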
4.3 Tests of Assumptions
4.3.1 Equal Variance – Test Template:
Null hypothesis: variation of Y is the same for all levels of the factor
Alternative: at least two are different
Compute the absolute difference between each value in a group and the
median of the group
Test Statistic and rejection region: same as for the factor tests
Conclusion: We can (not) say the variation of Y differs for at least two
levels of the factor.
Example: Y = tensile strength of a product
Factor = 4 Suppliers
Obtain samples of size 5 from each supplier (n = ____ )
MSDifference = sample factor variability = 0.59
MSE = sample error variability = 2.2853
Null hypothesis: 1=2=3=4 (variability of __________ is the
Same for all ________________)
Alternative: at least two are different
Test Statistic: MSDiff/MSE =
Rejection Region: Reject Ho if F > F table with
Numerator degrees of freedom = ______ and denomination d.f. = _____
F-Table = _______
Conclusion: We can (not) say that the variability of _________________
differs for at least two ________________________
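The equal-variance procedure described above (absolute differences from each group median, followed by the ordinary one-factor F test on those differences) can be sketched as below. The strength values are made up for illustration and are not the data behind the example, and the function names are hypothetical.

```python
from statistics import mean, median


def abs_dev_from_median(groups):
    """Replace each value by its absolute difference from its group median."""
    result = {}
    for name, values in groups.items():
        center = median(values)
        result[name] = [abs(v - center) for v in values]
    return result


def one_way_f(groups):
    """One-factor F ratio with numerator and denominator degrees of freedom."""
    all_values = [v for values in groups.values() for v in values]
    grand = mean(all_values)
    n, k = len(all_values), len(groups)
    ss_among = sum(len(v) * (mean(v) - grand) ** 2 for v in groups.values())
    ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
    return (ss_among / (k - 1)) / (ss_within / (n - k)), k - 1, n - k


# Hypothetical tensile strengths for 4 suppliers, 5 samples each
strength = {
    "S1": [18.5, 24.0, 17.2, 19.9, 18.0],
    "S2": [26.3, 25.3, 24.0, 21.2, 24.5],
    "S3": [20.6, 25.2, 20.8, 24.7, 22.9],
    "S4": [25.4, 19.9, 22.6, 17.5, 20.4],
}
f_levene, df_num, df_den = one_way_f(abs_dev_from_median(strength))
print(round(f_levene, 3), df_num, df_den)
```

A small F ratio here means the groups show similar spread, so the equal-variance assumption is not rejected.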
4.3.2 Normality – Test Template:
Null hypothesis: distribution of Y is normal for all levels of the factor
Alternative: at least one is not normal
Test Statistic and rejection region: use the normality tests in NCSS; if the
p-value is less than alpha, reject normality.
Conclusion: We can (not) say the distribution of Y is not normal for at
least one level of the factor.
Example: Y = tensile strength of a product
Factor = 4 Suppliers
Obtain samples of size 5 from each supplier (n = ____ )
Assumption Test                      Prob-Level
Skewness Normality of Residuals      ______
Kurtosis Normality of Residuals      ______
Omnibus Normality of Residuals       ______
Null hypothesis: distribution of __________ is normally distributed for
all ________________
Alternative: distribution of __________ is non-normally distributed for at
least one level of ________________
Test Statistic: p-value
Rejection region: p-value < alpha
Conclusion: We can (not) say that the distribution of __________ is
non-normally distributed for at least one level of ________________
4.4 Testing the difference in means
4.4.1 Experimentwise error versus comparison error
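A rough illustration of why the two error rates differ: if each pairwise comparison is tested at level alpha, the chance of at least one false rejection across all comparisons grows with the number of comparisons. The sketch below assumes the comparisons are independent, which pairwise comparisons on the same data are not, so the result is only an approximation. The function name is hypothetical.

```python
from math import comb


def experimentwise_error(alpha, n_groups):
    """Number of pairwise comparisons and the approximate chance of at least
    one false rejection, assuming independent comparisons at level alpha."""
    n_comparisons = comb(n_groups, 2)
    return n_comparisons, 1 - (1 - alpha) ** n_comparisons


n_comparisons, rate = experimentwise_error(alpha=0.05, n_groups=4)
print(n_comparisons, round(rate, 4))  # 6 comparisons, about 0.2649
```

Multiple-comparison procedures such as Tukey-Kramer are designed to hold the experimentwise rate at alpha instead.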
4.4.2 Testing one factor
Use NCSS. The output will tell you which means are statistically different
Example: Y = tensile strength of a product
Factor = 4 Suppliers
Obtain samples of size 5 from each supplier
Tukey-Kramer Multiple-Comparison Test
Response: strength
Term A: supplier
Alpha=0.050 Error Term=S(A) DF=16 MSE=6.094 Critical Value=______
Group   Count   Mean   Different From Groups
(output rows not reproduced)
Conclusions: We can say that the average value of (Y) _________ for (factor
level) ________ differs from that for (factor level) ________.
<Repeat for each difference>
The averages of (Y) for the other (factor levels) ______________ are not
significantly different.
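For readers who want to see what sits behind the software output, the Tukey-Kramer rule declares two means different when their absolute difference exceeds a critical range. In this sketch the studentized range value 4.05 (alpha = 0.05, 4 groups, 16 error d.f.) is taken from a standard table and is an assumption, as are the function names; MSE = 6.094 and the group size of 5 come from the example.

```python
import math


def tukey_kramer_critical_range(q_crit, mse, n_i, n_j):
    """Critical range for comparing two group means with sizes n_i and n_j."""
    return q_crit * math.sqrt((mse / 2.0) * (1.0 / n_i + 1.0 / n_j))


def means_differ(mean_i, mean_j, critical_range):
    """True when the two sample means differ by more than the critical range."""
    return abs(mean_i - mean_j) > critical_range


Q_TABLE = 4.05  # assumed table value: alpha 0.05, 4 groups, 16 error d.f.
critical_range = tukey_kramer_critical_range(Q_TABLE, mse=6.094, n_i=5, n_j=5)
print(round(critical_range, 3))
```

Any pair of supplier means further apart than this range would be reported as different.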
4.4.3 Same procedure works for Randomized Block and Two-factor studies
without interaction.
4.5 Nonparametric tests
4.5.1 Kruskal-Wallis test
One-factor designs
Compares medians instead of means
Test similar to ANOVA but does not require normality
Using NCSS: if p-value < alpha, reject equality of medians
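The Kruskal-Wallis statistic is built from ranks rather than raw values. A minimal sketch, with hypothetical function names and made-up data, that ranks the pooled values (averaging tied ranks) and computes H without a tie correction, so it may differ slightly from software output when ties are present:

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for idx in order[i:j + 1]:
            ranks[idx] = shared
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """H statistic (no tie correction) for a list of groups of values."""
    pooled = [v for group in groups for v in group]
    ranks = average_ranks(pooled)
    n = len(pooled)
    total, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


# Made-up data: three groups of three values
h = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(round(h, 2))  # 7.2
```

H is compared with a chi-square value with (number of groups - 1) degrees of freedom; for 3 groups at alpha = 0.05 that table value is 5.99.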
4.5.2 Friedman’s Test
Randomized block designs
Compares medians instead of means
Test similar to ANOVA but does not require normality
Using NCSS: if p-value < alpha, reject equality of medians
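Friedman's statistic ranks the treatments separately within each block and then compares the rank sums. The sketch below uses made-up ratings and a hypothetical function name, and assumes there are no ties within a block.

```python
def friedman_statistic(blocks):
    """Friedman statistic for rows of values, one row per block and one
    column per treatment. Assumes no tied values within a block."""
    b, k = len(blocks), len(blocks[0])
    rank_sums = [0] * k
    for row in blocks:
        ordered = sorted(row)
        for j, value in enumerate(row):
            rank_sums[j] += ordered.index(value) + 1  # rank within the block
    return 12.0 / (b * k * (k + 1)) * sum(r ** 2 for r in rank_sums) - 3 * b * (k + 1)


# Made-up data: 3 blocks (raters), 3 treatments, same ordering in every block
q = friedman_statistic([[10, 20, 30], [11, 25, 28], [9, 19, 40]])
print(q)  # 6.0
```

As with Kruskal-Wallis, the statistic is compared with a chi-square value with (number of treatments - 1) degrees of freedom.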
5.1 Data format
Place all the values of Y in one column and let the next column(s) hold the
values of the factor(s).
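If the data start out grouped by factor level, they can be reshaped into this one-column layout before being entered. A small sketch with made-up values and a hypothetical function name:

```python
def to_long_format(groups):
    """Turn {level: [y values]} into rows of (y, level), one row per value."""
    rows = []
    for level, values in groups.items():
        for y in values:
            rows.append((y, level))
    return rows


# Made-up strengths for two suppliers
rows = to_long_format({"Supplier1": [18.5, 24.0], "Supplier2": [26.3]})
print("strength\tsupplier")
for y, level in rows:
    print(f"{y}\t{level}")
```

For a two-factor design, each row would carry one more column holding the level of the second factor.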
5.2 Approach
5.2.1 One factor designs
Click on Analysis, ANOVA, One-Way ANOVA
Choose the dependent variable and factor
In reports, uncheck EMS report and check Tukey-Kramer Test
5.2.2 Randomized Block Designs
Click on Analysis, ANOVA, Analysis of Variance
o First, the dependent variable
o Second, for factor 1 the block and choose Random from Type-list
o Third, for factor 2 the factor of interest, (fixed type)
In reports, uncheck EMS report and check Tukey-Kramer Test
5.2.3 Two-factor designs
Click on Analysis, ANOVA, Analysis of Variance
o First, the dependent variable
o Second, factor 1 Type Fixed
o Third, factor 2 Type fixed
o If interaction exists, tests for two-factor interaction
In reports, uncheck EMS report and check Tukey-Kramer Test