Lecture 9: One Way ANOVA Between Subjects

Laura McAvinue

School of Psychology

Trinity College Dublin

Analysis of Variance

• A statistical technique for testing for differences between the means of several groups

– One of the most widely used statistical tests

• T-Test

– Compare the means of two groups

• Independent samples

• Paired samples

• ANOVA

– No restriction on the number of groups

T-test

[Figure: scores for Group 1 and Group 2, each summarised by its mean]

Is the mean of one group significantly different to the mean of the other group?

t-test: $H_0: \mu_1 = \mu_2$

$H_1: \mu_1 \neq \mu_2$

F-test

[Figure: scores for Group 1, Group 2 and Group 3, each summarised by its mean]

Is the mean of one group significantly different to the means of the other groups?

Analysis of Variance

• One way ANOVA

– One independent variable

– Between subjects: different participants

– Repeated measures / within subjects: same participants

• Factorial ANOVA

– More than one independent variable

– Two way, three way, four way

A few examples…

• Between subjects one way ANOVA

– The effect of one independent variable with three or more levels on a dependent variable

• What are the independent & dependent variables in each of the following studies?

– The effect of three drugs on reaction time

– The effect of five styles of teaching on exam results

– The effect of age (old, middle, young) on recall

– The effect of gender (male, female) on hostility

Rationale

• Let’s say you have three groups and you want to see if they are significantly different…

• Recall inferential statistics

– Sample → Population

• Your question:

– Are these 3 groups representative of the same population or of different populations?

Population

• Draw 3 samples (1, 2, 3)

• Manipulate the samples (Drug 1, Drug 2, Drug 3)

• Measure the effect of the manipulation on a DV ($\mu_1$, $\mu_2$, $\mu_3$)

Did the manipulation alter the samples to such an extent that they now represent different populations?

Recall sampling error & the sampling distribution of the mean…

The means of samples drawn from the same population will differ a little due to random sampling error

When comparing the means of a number of groups, your task is to decide which of these explains the difference:

• Difference due to a true difference between the samples (representative of different populations)?

• Difference due to random sampling error (representative of the same population)?

If a true difference exists, this is due to your manipulation, the independent variable

Steps of NHST

1. Specify the alternative / research hypothesis

At least one mean is significantly different from the others

At least one group is representative of a separate population

2. Set up the null hypothesis

The hypothesis that all population means are equal

All groups are representative of the same population

Omnibus $H_0: \mu_1 = \mu_2 = \mu_3$

3. Collect your data

4. Run the appropriate statistical test

Between subjects one way ANOVA

5. Obtain the test statistic & associated p-value

F statistic

Compare the F statistic you obtained with the distribution of F when $H_0$ is true

Determine the probability of obtaining an F value at least this large when $H_0$ is true

6. Decide whether to reject or fail to reject $H_0$ on the basis of the p value

If the p value is very small (< .05), reject $H_0$

Conclude that at least one sample mean is significantly different to the other means…

Not all groups are representative of the same population

How is ANOVA done?

• Assume $H_0$ is true

• Assume that all three groups are representative of the same population

• Make two estimates of the variance of this population

• If $H_0$ is true, then these two estimates should be about the same

• If $H_0$ is false, these two estimates should be different

Two estimates of population variance

• Within group variance

• Pooled variability among participants in each treatment group

• Between group variance

• Variability among group means

If $H_0$ is true…

$\frac{\text{Between Groups Variance}}{\text{Within Groups Variance}} \approx 1$

If $H_0$ is false…

$\frac{\text{Between Groups Variance}}{\text{Within Groups Variance}} > 1$

Calculations

• Step 1: Sum of squares

• Step 2: Degrees of freedom

• Step 3: Mean square

• Step 4: F ratio

• Step 5: p value

Total variance in data ($SS_{total}$)

• Between groups variance ($SS_{between}$)

• Within groups variance ($SS_{within}$)

$SS_{total} = SS_{between} + SS_{within}$

$SS_{total}$

• $\sum (x_{ij} - \text{Grand Mean})^2$

• Based on the difference between each score and the grand mean

• The sum of squared deviations of all observations, regardless of group membership, from the grand mean

$SS_{between}$

• $n \sum (\text{Group Mean}_j - \text{Grand Mean})^2$

• Based on the differences between groups

• Related to the variance of the group means

• The sum of squared deviations of the group means from the grand mean, multiplied by the number of observations in each group

$SS_{within}$

• $\sum (x_{ij} - \text{Group Mean}_j)^2$

• Based on the variability within each group

• Calculate SS within each group & add

• The sum of squared deviations within each group … or …

• $SS_{total} - SS_{between}$
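The three sums of squares can be checked by hand on a small data set. The sketch below uses made-up scores for three equal-sized groups (the numbers and the group names are illustrative assumptions, not data from the lecture) and confirms that SS total splits into SS between plus SS within.

```python
# Hypothetical scores for three equal-sized groups (illustrative data only).
groups = {
    "drug_1": [4, 5, 6],
    "drug_2": [6, 7, 8],
    "drug_3": [8, 9, 10],
}


def mean(values):
    return sum(values) / len(values)


all_scores = [x for scores in groups.values() for x in scores]
grand_mean = mean(all_scores)

# SS total: every score against the grand mean, ignoring group membership.
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

# SS between: each group mean against the grand mean, weighted by group size.
ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in groups.values())

# SS within: every score against its own group mean, added up over groups.
ss_within = sum((x - mean(s)) ** 2 for s in groups.values() for x in s)

print(ss_total, ss_between, ss_within)  # 30.0 24.0 6.0
```

Here 24 + 6 = 30, so SS within could equally have been obtained as SS total minus SS between.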

Degrees of Freedom

• Total variance

• N – 1

• Total no. of observations - 1

• Between groups variance

• k – 1

• No. of groups – 1

• Within groups variance

• k (n – 1)

• No. of groups (no. in each sample – 1)

• What’s left over!
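As a quick check on the three formulas above, assuming equal group sizes so that N = kn, the between and within degrees of freedom add up to the total:

```latex
\begin{aligned}
df_{between} + df_{within} &= (k - 1) + k(n - 1) \\
                           &= kn - 1 \\
                           &= N - 1 = df_{total}
\end{aligned}
```

For example, with k = 3 groups of n = 10 participants (hypothetical numbers), 2 + 27 = 29 = 30 - 1.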

Mean Square

• SS / df

• The average variance between or within groups

• An estimate of the population variance

• $MS_{between} = SS_{between} / df_{between}$

• $MS_{within} = SS_{within} / df_{within}$

F Ratio

$F = \frac{MS_{between}}{MS_{within}} = \frac{\text{Treatment effect} + \text{Differences due to chance}}{\text{Differences due to chance}}$

If $H_0$ is true, $F \approx 1$

If $H_0$ is false, $F > 1$

If treatment has no effect…

$F = \frac{0 + \text{Differences due to chance}}{\text{Differences due to chance}} \approx 1$

If treatment has an effect…

$F = \frac{\text{Effect} (> 0) + \text{Differences due to chance}}{\text{Differences due to chance}} > 1$
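To make the ratio concrete, here is a minimal sketch that computes F as MS between divided by MS within for two made-up data sets. The function name `f_ratio` and both data sets are assumptions for illustration; the spread within each group is the same in both cases, so only the distance between group means changes.

```python
def f_ratio(groups):
    """Return MS between / MS within for a list of groups of scores."""
    scores = [x for g in groups for x in g]
    grand_mean = sum(scores) / len(scores)
    means = [sum(g) / len(g) for g in groups]

    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)  # equals k(n - 1) when group sizes are equal

    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical data: group means 5, 5, 6 (small, chance-sized differences).
chance_only = [[4, 5, 6], [4, 5, 6], [5, 6, 7]]
# Hypothetical data: group means 5, 7, 9 (large differences between groups).
with_effect = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]

print(f_ratio(chance_only))  # close to 1
print(f_ratio(with_effect))  # 12.0
```

In the first data set the two variance estimates agree, so F sits near 1; in the second the group means are far apart relative to the within-group spread, so F is well above 1.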

Variance within groups > variance between groups

$F = MS_{BG} / MS_{WG} < 1$

Fail to reject $H_0$

If there is more variance within the groups than between them, then any difference observed is due to chance

Variance within groups = variance between groups

$F = MS_{BG} / MS_{WG} = 1$

Fail to reject $H_0$

If both sources of variance are the same, then any difference observed is due to chance

Variance within groups < variance between groups

$F = MS_{BG} / MS_{WG} > 1$

Reject $H_0$

The more the group means differ from each other relative to the variability within groups, the more likely it is that the differences are not due to chance.

Size of F

• How much greater than 1 does F have to be to reject $H_0$?

• Compare the obtained F statistic to the distribution of F when $H_0$ is true

• Calculate the probability of obtaining this F value when $H_0$ is true

• This probability is the p value

• If p < .05, reject $H_0$

• Conclude that at least one of your groups is significantly different from the others
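In practice the p value is read from the F distribution by statistical software. The sketch below is a different technique, used here only to illustrate the idea: it approximates "the distribution of F when $H_0$ is true" by Monte Carlo simulation, repeatedly drawing all groups from one normal population and counting how often the simulated F is at least as large as the observed one. The function names, the seed, the number of simulations and the observed F of 12 (k = 3 groups, n = 3 per group) are all assumptions for illustration.

```python
import random


def f_ratio(groups):
    """Return MS between / MS within for a list of groups of scores."""
    scores = [x for g in groups for x in g]
    grand_mean = sum(scores) / len(scores)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (len(groups) - 1)) / (ss_within / (len(scores) - len(groups)))


def simulated_p_value(observed_f, k, n, n_sims=5000, seed=0):
    """Share of simulated null F values that are at least as large as observed_f."""
    rng = random.Random(seed)
    at_least_as_large = 0
    for _ in range(n_sims):
        # Under H0 every group comes from the same population (standard normal here).
        null_groups = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(k)]
        if f_ratio(null_groups) >= observed_f:
            at_least_as_large += 1
    return at_least_as_large / n_sims


p_large_f = simulated_p_value(observed_f=12.0, k=3, n=3)
p_small_f = simulated_p_value(observed_f=1.0, k=3, n=3)

print(p_large_f < 0.05)  # an F of 12 is rare under H0, so reject H0
print(p_small_f < 0.05)  # an F of 1 is common under H0, so fail to reject H0
```

The simulated value is only an approximation and will vary slightly with the seed and the number of simulations.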

ANOVA table

Source of variation | SS | df | MS | F | p

Between groups | $n \sum (\text{Group Mean}_j - \text{Grand Mean})^2$ | $k - 1$ | $SS_{BG} / df_{BG}$ | $MS_{Between} / MS_{Within}$ | Prob. of observing F-value when $H_0$ is true

Within groups | $\sum (x_{ij} - \text{Group Mean}_j)^2$ | $k(n - 1)$ | $SS_{WG} / df_{WG}$ | |

Total | $\sum (x_{ij} - \text{Grand Mean})^2$ | $N - 1$ | | |

A few assumptions…

• Data in each group should be…

• Interval scale

• Normally distributed

• Histograms, box plots

• Homogeneity of variance

• Variance of groups should be roughly equal

• Independence of observations

• Each person should be in only one group

• Participants should be randomly assigned to groups
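A rough way to look at the homogeneity of variance assumption is to compute the variance of each group and compare them. The sketch below uses made-up scores and Python's `statistics.variance` (the sample variance). The lecture does not give a cut-off for "roughly equal", so the ratio is printed for inspection only and no threshold is applied.

```python
from statistics import variance

# Hypothetical scores for three groups (illustrative data only).
groups = {
    "drug_1": [4.0, 5.0, 6.0],
    "drug_2": [5.0, 7.0, 9.0],
    "drug_3": [8.0, 9.0, 10.0],
}

# Sample variance (n - 1 in the denominator) for each group.
variances = {name: variance(scores) for name, scores in groups.items()}

# Ratio of the largest to the smallest group variance; 1 would mean identical spread.
ratio = max(variances.values()) / min(variances.values())

for name, value in variances.items():
    print(name, value)
print("largest / smallest variance:", ratio)
```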

Multiple Comparison Procedures

• Obtain a significant F statistic

• Reject $H_0$ & conclude that at least one sample mean is significantly different from the others

• But which one?

• $H_1: \mu_1 \neq \mu_2 \neq \mu_3$

• $H_2: \mu_1 = \mu_2 \neq \mu_3$

• $H_3: \mu_1 \neq \mu_2 = \mu_3$

• Necessary to run a series of multiple comparisons to compare groups and see where the significant differences lie
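To see how many comparisons a full set of pairwise tests involves, the groups can be paired off with `itertools.combinations`. This is a small sketch with hypothetical group names; the helper `n_pairwise_comparisons` is an illustrative name, and it simply counts the k(k - 1)/2 distinct pairs among k groups.

```python
from itertools import combinations

# Hypothetical group labels for a three-group study.
group_names = ["drug_1", "drug_2", "drug_3"]

# Every distinct pair of groups is one possible comparison.
pairs = list(combinations(group_names, 2))
print(pairs)  # [('drug_1', 'drug_2'), ('drug_1', 'drug_3'), ('drug_2', 'drug_3')]


def n_pairwise_comparisons(k):
    """Number of distinct pairs among k groups."""
    return k * (k - 1) // 2


print(n_pairwise_comparisons(3))  # 3
print(n_pairwise_comparisons(5))  # 10
```

The count grows quickly with the number of groups, which is why the number of comparisons matters for the error rate.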

Problem with Multiple Comparisons

• Making multiple comparisons leads to a higher probability of making a Type I error

• The more comparisons you make, the higher the probability of making a Type I error

• Familywise error rate

• The probability that a family of comparisons contains at least one Type I error

– $\alpha_{familywise} = 1 - (1 - \alpha)^c$, where c = number of comparisons

– Four comparisons run at $\alpha = .05$

$\alpha_{familywise} = 1 - (1 - .05)^4 = 1 - .8145 = .19$

– You think you are working at $\alpha = .05$, but you're actually working at $\alpha = .19$

Post hoc tests

• Bonferroni Procedure

• $\alpha / c$

• Divide your significance level by the number of comparisons you plan on making and use this more conservative value as your level of significance

• Four comparisons at $\alpha = .05$

• .05 / 4 = .0125

• Reject $H_0$ if p < .0125
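A minimal sketch of the Bonferroni procedure follows, assuming four planned comparisons. The p values are invented for illustration and the function name `bonferroni_alpha` is hypothetical. It also recomputes the familywise rate at the corrected level to show that it stays just under .05.

```python
def bonferroni_alpha(alpha, c):
    """Per-comparison significance level after dividing alpha by c comparisons."""
    return alpha / c


n_comparisons = 4
per_test_alpha = bonferroni_alpha(0.05, n_comparisons)
print(per_test_alpha)  # 0.0125

# Familywise error rate when each comparison is run at the corrected level.
familywise = 1 - (1 - per_test_alpha) ** n_comparisons
print(familywise < 0.05)  # True

# Hypothetical p values from four comparisons (made up for illustration).
p_values = [0.030, 0.004, 0.200, 0.011]
decisions = [p < per_test_alpha for p in p_values]
print(decisions)  # [False, True, False, True]
```

Note that the first p value (.030) would have counted as significant at .05 but does not survive the correction.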

• Note: Restrict the number of comparisons to the ones you are most interested in

• Tukey

• Compares each mean with each other mean in a way that keeps the maximum familywise error rate to .05

• Computes a single value that represents the minimum difference between group means that is necessary for significance

Effect Size

• A statistically significant difference might not mean anything in the real world

Eta squared

$\eta^2 = \frac{SS_{between}}{SS_{total}}$

Percentage of variability among observations that can be attributed to the differences between the groups

A little less biased…

Omega squared

$\omega^2 = \frac{SS_{between} - (k - 1) MS_{within}}{SS_{total} + MS_{within}}$

How big is big? Similar to correlation coefficient
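Both effect size measures can be computed directly from the quantities in an ANOVA table. The sketch below assumes hypothetical values (SS between = 24, SS total = 30, MS within = 1, k = 3 groups); the function names are illustrative.

```python
def eta_squared(ss_between, ss_total):
    """Share of total variability attributable to differences between groups."""
    return ss_between / ss_total


def omega_squared(ss_between, ss_total, ms_within, k):
    """Less biased estimate that adjusts for the number of groups and MS within."""
    return (ss_between - (k - 1) * ms_within) / (ss_total + ms_within)


# Hypothetical ANOVA table values (illustrative only).
eta = eta_squared(ss_between=24, ss_total=30)
omega = omega_squared(ss_between=24, ss_total=30, ms_within=1, k=3)

print(eta)              # 0.8
print(round(omega, 2))  # 0.71
```

Omega squared comes out smaller than eta squared on the same data, in line with it being the less biased of the two.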

Cohen’s d

When comparing two groups

$d = \frac{\text{Mean}_{treat} - \text{Mean}_{control}}{SD_{control}}$
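A minimal sketch of this calculation follows, using the control group's standard deviation in the denominator as in the formula above. The scores are made up, the function name `cohens_d` is illustrative, and using the sample standard deviation (`statistics.stdev`, with n - 1 in the denominator) is an assumption.

```python
from statistics import mean, stdev


def cohens_d(treatment, control):
    """Mean difference between treatment and control, in control-group SD units."""
    return (mean(treatment) - mean(control)) / stdev(control)


# Hypothetical scores (illustrative data only).
treatment_scores = [8.0, 9.0, 10.0]
control_scores = [4.0, 5.0, 6.0]

d = cohens_d(treatment_scores, control_scores)
print(d)  # 4.0: the treatment mean is four control-group SDs above the control mean
```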
