Laura McAvinue
School of Psychology
Trinity College Dublin
• A statistical technique for testing for differences between the means of several groups
– One of the most widely used statistical tests
• T-Test
– Compare the means of two groups
• Independent samples
• Paired samples
• ANOVA
– No restriction on the number of groups
[Figure: two groups, each marked with its mean]
Is the mean of one group significantly different to the mean of the other group?
• t-test:
– H0: µ1 = µ2
– H1: µ1 ≠ µ2
[Figure: three groups, each marked with its mean]
Is the mean of one group significantly different to the means of the other groups?
Analysis of Variance
• One way ANOVA
– One independent variable
– Between subjects: different participants
– Repeated measures / within subjects: same participants
• Factorial ANOVA
– More than one independent variable
– Two way, three way, four way
• Between subjects one way ANOVA
– The effect of one independent variable with three or more levels on a dependent variable
• What are the independent & dependent variables in each of the following studies?
– The effect of three drugs on reaction time
– The effect of five styles of teaching on exam results
– The effect of age (old, middle, young) on recall
– The effect of gender (male, female) on hostility
• Let’s say you have three groups and you want to see if they are significantly different…
• Recall inferential statistics
– Sample → Population
• Your question:
– Are these 3 groups representative of the same population or of different populations?
[Diagram: Population → draw 3 samples → manipulate the samples (Drug 1, Drug 2, Drug 3) → measure the effect of the manipulation on a DV; population means µ1, µ2, µ3]
Did the manipulation alter the samples to such an extent that they now represent different populations?
Recall sampling error & the sampling distribution of the mean…
The means of samples drawn from the same population will differ a little due to random sampling error
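Random sampling error can be illustrated with a short simulation. This is only a sketch: the population (reaction times with mean 500 ms and SD 50 ms), the sample size and all names are hypothetical choices, not values from the lecture.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: reaction times, mean 500 ms, SD 50 ms
POP_MEAN, POP_SD, N_PER_SAMPLE = 500, 50, 20

def draw_sample_mean():
    """Draw one random sample from the population and return its mean."""
    sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(N_PER_SAMPLE)]
    return statistics.mean(sample)

# Three samples from the SAME population, with no manipulation applied
sample_means = [draw_sample_mean() for _ in range(3)]
for i, m in enumerate(sample_means, start=1):
    print(f"Sample {i} mean: {m:.1f}")
```

The three means differ from one another even though nothing was done to the samples, which is the sampling error that ANOVA has to separate from a true difference.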
When comparing the means of a number of groups, your task is to decide between two explanations:
• The difference is due to a true difference between the samples (they are representative of different populations)
• The difference is due to random sampling error (they are representative of the same population)
If a true difference exists, this is due to your manipulation, the independent variable
1. Specify the alternative / research hypothesis
At least one mean is significantly different from the others
At least one group is representative of a separate population
2. Set up the null hypothesis
The hypothesis that all population means are equal
All groups are representative of the same population
Omnibus H0: µ1 = µ2 = µ3
3. Collect your data
4. Run the appropriate statistical test
Between subjects one way ANOVA
5. Obtain the test statistic & associated p-value
F statistic
Compare the F statistic you obtained with the distribution of F when H0 is true
Determine the probability of obtaining such an F value when H0 is true
6. Decide whether to reject or fail to reject H0 on the basis of the p value
If the p value is very small (< .05), reject H0
…
Conclude that at least one sample mean is significantly different to the other means…
Not all groups are representative of the same population
Assume H0 is true
Assume that all three groups are representative of the same population
Make two estimates of the variance of this population
If H0 is true, then these two estimates should be about the same
If H0 is false, these two estimates should be different
• Within group variance
• Pooled variability among participants in each treatment group
• Between group variance
• Variability among group means
If H0 is true…
Between Groups Variance / Within Groups Variance ≈ 1
If H0 is false…
Between Groups Variance / Within Groups Variance > 1
Step…
1: Sum of squares
2: Degrees of freedom
3: Mean square
4: F ratio
5: p value
Total variance in data (SS total) = Between groups variance (SS between) + Within groups variance (SS within)
SS total
• $\sum (x_{ij} - \text{Grand Mean})^2$
• Based on the difference between each score and the grand mean
• The sum of squared deviations of all observations, regardless of group membership, from the grand mean
SS between
• $n \sum (\text{Group Mean}_j - \text{Grand Mean})^2$
• Based on the differences between groups
• Related to the variance of the group means
• The sum of squared deviations of the group means from the grand mean, multiplied by the number of observations in each group
SS within
• $\sum (x_{ij} - \text{Group Mean}_j)^2$
• Based on the variability within each group
• Calculate SS within each group & add
• The sum of squared deviations within each group, or equivalently
• SS total - SS between
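As a minimal sketch of the three sums of squares, assuming made-up scores for three drug groups with five participants each (the data and variable names are hypothetical, not from the lecture):

```python
# Hypothetical scores on the DV, one list per group (equal group sizes)
groups = [
    [2, 4, 6, 4, 4],   # Drug 1
    [4, 6, 8, 6, 6],   # Drug 2
    [6, 8, 10, 8, 8],  # Drug 3
]

all_scores = [x for g in groups for x in g]
grand_mean = sum(all_scores) / len(all_scores)
group_means = [sum(g) / len(g) for g in groups]
n = len(groups[0])  # number of observations in each group

# SS total: every score against the grand mean
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
# SS between: group means against the grand mean, multiplied by n
ss_between = n * sum((m - grand_mean) ** 2 for m in group_means)
# SS within: every score against its own group mean
ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

print(ss_total, ss_between, ss_within)  # 64.0 40.0 24.0
```

With these numbers SS within equals SS total - SS between (64 - 40 = 24), so the shortcut and the direct calculation agree.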
• Total variance
– df total = N – 1
– Total no. of observations – 1
• Between groups variance
– df between = k – 1
– No. of groups – 1
• Within groups variance
– df within = k (n – 1)
– No. of groups × (no. in each sample – 1)
– What’s left over!
• SS / df
• The average variance between or within groups
• An estimate of the population variance
• MS between = SS between / df between
• MS within = SS within / df within
$F = \frac{MS_{\text{between}}}{MS_{\text{within}}}$
If H0 is true, F ≈ 1
If H0 is false, F > 1
$F = \frac{MS_{\text{between}}}{MS_{\text{within}}} = \frac{\text{Treatment effect} + \text{Differences due to chance}}{\text{Differences due to chance}}$
If treatment has no effect…
$F = \frac{0 + \text{Differences due to chance}}{\text{Differences due to chance}} \approx 1$
If treatment has effect…
$F = \frac{(\text{Effect} > 0) + \text{Differences due to chance}}{\text{Differences due to chance}} > 1$
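The steps from sums of squares to the F ratio can be sketched numerically. The values below are hypothetical (three groups of five, SS between = 40, SS within = 24) and are chosen only to keep the arithmetic easy to follow.

```python
# Hypothetical design and sums of squares (illustrative values only)
k, n = 3, 5                      # number of groups, observations per group
ss_between, ss_within = 40.0, 24.0

# Degrees of freedom
df_between = k - 1               # 2
df_within = k * (n - 1)          # 12

# Mean squares: SS / df
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# F ratio: between groups variance over within groups variance
f_ratio = ms_between / ms_within
print(f"F({df_between}, {df_within}) = {f_ratio:.2f}")  # F(2, 12) = 10.00
```

Here the between groups estimate of the variance is ten times the within groups estimate, which is the pattern expected when the treatment has an effect.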
[Figure: three panels comparing MS between groups (MS BG) with MS within groups (MS WG), one per case below]
Variance within groups > variance between groups
F < 1
Fail to reject H0
If there is more variance within the groups, then any difference observed is due to chance
Variance within groups = variance between groups
F = 1
Fail to reject H0
If both sources of variance are the same, then any difference observed is due to chance
Variance within groups < variance between groups
F > 1
Reject H0
The more the group means differ relative to the variability within the groups, the more likely it is that the differences are not due to chance.
• How much greater than 1 does F have to be to reject H0?
• Compare the obtained F statistic to the distribution of F when H0 is true
• Calculate the probability of obtaining this F value when H0 is true
– p value
• If p < .05, reject H0
• Conclude that at least one of your groups is significantly different from the others
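One way to see what "the distribution of F when H0 is true" means is to simulate it. The sketch below is an illustration, not how statistical packages compute p (they use the theoretical F distribution). The observed data, the helper name `f_ratio`, the number of simulations and the standard normal null population are all assumptions.

```python
import random

random.seed(42)  # fixed seed so the sketch is repeatable

def f_ratio(groups):
    """F ratio for a one way between subjects design with equal group sizes."""
    k, n = len(groups), len(groups[0])
    means = [sum(g) / n for g in groups]
    grand = sum(means) / k  # valid as the grand mean because group sizes are equal
    ss_between = n * sum((m - grand) ** 2 for m in means)
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (k * (n - 1)))

# Hypothetical observed data: three groups of five
observed = [[2, 4, 6, 4, 4], [4, 6, 8, 6, 6], [6, 8, 10, 8, 8]]
f_obs = f_ratio(observed)

# Distribution of F when H0 is true: all three samples come from one population
n_sims = 5000
null_fs = [
    f_ratio([[random.gauss(0, 1) for _ in range(5)] for _ in range(3)])
    for _ in range(n_sims)
]

# p value: proportion of null F values at least as large as the observed F
p_value = sum(f >= f_obs for f in null_fs) / n_sims
print(f"F = {f_obs:.2f}, simulated p = {p_value:.4f}")
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```

Most simulated F values sit near 1, so an observed F far above 1 is rare under H0 and yields a small p value.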
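ANOVA table
| Source of variation | SS | df | MS | F | p |
| --- | --- | --- | --- | --- | --- |
| Between groups | $n \sum (\text{Group Mean}_j - \text{Grand Mean})^2$ | k - 1 | SS BG / df BG | MS between / MS within | Prob. of observing this F value when H0 is true |
| Within groups | $\sum (x_{ij} - \text{Group Mean}_j)^2$ | k(n - 1) | SS WG / df WG | | |
| Total | $\sum (x_{ij} - \text{Grand Mean})^2$ | N - 1 | | | |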
• Data in each group should be…
• Interval scale
• Normally distributed
• Histograms, box plots
• Homogeneity of variance
• Variance of groups should be roughly equal
• Independence of observations
• Each person should be in only one group
• Participants should be randomly assigned to groups
• Obtain a significant F statistic
• Reject H0 & conclude that at least one sample mean is significantly different from the others
• But which one?
• H1: µ1 ≠ µ2 ≠ µ3
• H2: µ1 = µ2 ≠ µ3
• H3: µ1 ≠ µ2 = µ3
• Necessary to run a series of multiple comparisons to compare groups and see where the significant differences lie
• Making multiple comparisons leads to a higher probability of making a Type I error
• The more comparisons you make, the higher the probability of making a Type I error
• Familywise error rate
• The probability that a family of comparisons contains at least one Type I error
– $\alpha_{\text{familywise}} = 1 - (1 - \alpha)^c$, where c = number of comparisons
– Four comparisons run at α = .05
$\alpha_{\text{familywise}} = 1 - (1 - .05)^4 = 1 - .8145 = .19$
– You think you are working at α = .05, but you’re actually working at α = .19
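The growth of the familywise error rate with the number of comparisons can be checked with a few lines of code. This is a sketch only: the function name is hypothetical, and the formula assumes the comparisons are independent.

```python
def familywise_error_rate(alpha, c):
    """Probability of at least one Type I error across c independent comparisons."""
    return 1 - (1 - alpha) ** c

# Per-comparison alpha of .05 with an increasing number of comparisons
for c in (1, 2, 4, 10):
    print(f"{c:>2} comparisons: familywise alpha = {familywise_error_rate(0.05, c):.4f}")
```

For four comparisons this gives about .1855, which rounds to the .19 worked out above.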
• Bonferroni Procedure
• α / c
• Divide your significance level by the number of comparisons you plan on making and use this more conservative value as your level of significance
• Four comparisons at α = .05
• .05 / 4 = .0125
• Reject H0 if p < .0125
• Note: Restrict the number of comparisons to the ones you are most interested in
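A minimal sketch of the Bonferroni decision rule follows, assuming four planned comparisons. The comparison labels and their p values are invented for illustration.

```python
alpha = 0.05

# Hypothetical p values from four planned comparisons
p_values = {
    "Comparison A": 0.030,
    "Comparison B": 0.004,
    "Comparison C": 0.200,
    "Comparison D": 0.010,
}

# Bonferroni: divide the significance level by the number of comparisons
adjusted_alpha = alpha / len(p_values)
print(f"Adjusted significance level: {adjusted_alpha}")

rejected = []
for label, p in p_values.items():
    if p < adjusted_alpha:
        rejected.append(label)
        print(f"{label}: p = {p:.3f} -> reject H0")
    else:
        print(f"{label}: p = {p:.3f} -> fail to reject H0")
```

Comparison A would have been significant at .05 but is not at the adjusted level of .0125, which shows how the correction makes each individual test more conservative.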
• Tukey
• Compares each mean with each other mean in a way that keeps the maximum familywise error rate to .05
• Computes a single value that represents the minimum difference between group means that is necessary for significance
• A statistically significant difference might not mean anything in the real world
Eta squared
$\eta^2 = \frac{SS_{\text{between}}}{SS_{\text{total}}}$
Percentage of variability among observations that can be attributed to the differences between the groups
A little less biased…
Omega squared
$\omega^2 = \frac{SS_{\text{between}} - (k - 1) MS_{\text{within}}}{SS_{\text{total}} + MS_{\text{within}}}$
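Both effect size measures can be computed directly from the ANOVA table entries. The sketch below assumes hypothetical values (three groups of five, SS between = 40, SS within = 24); none of the numbers come from the lecture.

```python
# Hypothetical ANOVA results (illustrative values only)
k, n = 3, 5
ss_between, ss_within = 40.0, 24.0
ss_total = ss_between + ss_within
ms_within = ss_within / (k * (n - 1))

# Eta squared: share of total variability attributable to group differences
eta_sq = ss_between / ss_total

# Omega squared: a less biased estimate of the same quantity
omega_sq = (ss_between - (k - 1) * ms_within) / (ss_total + ms_within)

print(f"eta squared   = {eta_sq:.3f}")    # 0.625
print(f"omega squared = {omega_sq:.3f}")  # 0.545
```

Omega squared comes out smaller than eta squared here, reflecting its correction for the upward bias of eta squared.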
How big is big? Similar to correlation coefficient
Cohen’s d
When comparing two groups
$d = \frac{\text{Mean}_{\text{treat}} - \text{Mean}_{\text{control}}}{SD_{\text{control}}}$
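As a sketch of this two-group effect size, the code below divides the mean difference by the control group's standard deviation, following the formula on the slide. The scores are hypothetical, and using the sample standard deviation (n - 1 in the denominator) is an assumption.

```python
import statistics

# Hypothetical scores for a treatment group and a control group
treatment = [6, 8, 10, 8, 8]
control = [2, 4, 6, 4, 4]

mean_diff = statistics.mean(treatment) - statistics.mean(control)
sd_control = statistics.stdev(control)  # sample SD (n - 1), an assumption

d = mean_diff / sd_control
print(f"d = {d:.2f}")  # d = 2.83
```

A d of 2.83 means the treatment mean lies almost three control-group standard deviations above the control mean.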