One-way/Randomized Block Designs
Q560: Experimental Methods in Cognitive Science
Lecture 8
Reconstructive Memory: Loftus and Palmer (1974)
“How fast were the cars going when they ____
each other?” (hit, bumped, smashed)
[Figure: bar graph of estimated speed (MPH, roughly 30–44) as a function of the verb used in the question: hit, bumped, smashed.]
The problem with t-tests…
We could compare three groups with multiple t-tests: M1 vs. M2, M1 vs. M3, M2 vs. M3
But this causes our chance of a Type I error
(alpha) to compound with each test we do
Testwise Error: Probability of a type I error on
any one statistical test
Experimentwise Error: Probability of a type I
error over all statistical tests in an experiment
ANOVA keeps our experimentwise error = alpha
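To see how testwise errors compound, here is a quick sketch (pure Python; the alpha level and number of comparisons are the values from the three-group example above):

```python
# With three pairwise t-tests each run at alpha = .05, the chance of at
# least one Type I error somewhere in the experiment is 1 - (1 - alpha)^c.
alpha = 0.05       # testwise error rate
comparisons = 3    # M1 vs. M2, M1 vs. M3, M2 vs. M3

experimentwise = 1 - (1 - alpha) ** comparisons
print(round(experimentwise, 4))  # 0.1426 -- nearly triple the nominal .05
```

The formula assumes the tests are independent; for correlated tests it is only an approximation, but the compounding point stands either way.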
What is ANOVA?
In ANOVA an independent or quasi-independent
variable is called a factor.
Factor = independent (or quasi-independent)
variable.
Levels = number of values used for the
independent variable.
One factor → “single-factor design”
More than one factor → “factorial design”
An example of a single-factor design:
An example of a two-factor design:
What are we interested in?
Two interpretations:
1) Differences are due to chance.
2) Differences are real.
ANOVA Test Statistic
Remember the t statistic:

t = (actual difference between sample means) / (difference expected by chance)

The ANOVA test statistic (the F-ratio) is similar:

F = (actual variance between sample means) / (variance expected by chance)
Variance can be calculated for more
than two sample means …
An example:
The Logic of ANOVA
Hypothetical data from an experiment examining learning performance under three temperature conditions:

Treatment 1: 50° (sample 1)
Treatment 2: 70° (sample 2)
Treatment 3: 90° (sample 3)
The Logic of ANOVA
Looking at the data, there are two kinds of
variability (variance):
-Between treatments
-Within treatments
Variance between treatments can have two
interpretations:
-Variance is due to differences between treatments.
-Variance is due to chance alone. This may be due
to individual differences or experimental error.
The Logic of ANOVA
The F-ratio compares between- and within-treatment variance as follows:

F = (variance between treatments) / (variance within treatments)

Another way of expressing it:

F = (treatment effect + chance) / chance
The Logic of ANOVA
F = (treatment effect + chance) / chance

If there is no effect due to treatment: F ≈ 1.00.
If there is a significant effect due to treatment: F > 1.00.
The denominator of the F-ratio is also called the error term (it measures only unsystematic variance).
ANOVA Notation
What do all the letters mean?
k = number of levels of the factor (i.e. number of
treatments)
n = number of scores in each treatment
N = total number of scores in the entire study
T = X for each treatment condition
G = “grand total” of all the scores
We also need SS, M, and X2.
ANOVA Notation
What are the calculations we need to do?
1) Analysis of sum of squares (SS)
2) Analysis of degrees of freedom (df)
3) Calculation of variances (MS)
4) Calculation of the F-ratio
1) Analysis of Sum of Squares:

Total:    SStotal = ΣX² - G²/N
Between:  SSbetween = Σ(T²/n) - G²/N
Within:   SSwithin = ΣSS inside each treatment

Note: SStotal = SSwithin + SSbetween.
Just remember: you can get SSwithin by subtraction: SSwithin = SStotal - SSbetween.
2) Analysis of Degrees of Freedom:

Total:    dftotal = N - 1
Between:  dfbetween = k - 1
Within:   dfwithin = Σdf inside each treatment = N - k

Note: dftotal = dfwithin + dfbetween.
3) Calculation of Variances (MS) and the F-ratio:

Note: In ANOVA, variance = mean square (MS).

MSbetween = SSbetween / dfbetween
MSwithin = SSwithin / dfwithin

F = MSbetween / MSwithin
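The formulas above can be collected into one function. A minimal sketch in Python (the function and variable names are my own, not from the lecture):

```python
def one_way_anova(groups):
    """SS, MS, and F for a one-way independent-measures ANOVA.

    groups: list of lists of scores, one inner list per treatment.
    """
    k = len(groups)                      # number of treatments
    N = sum(len(g) for g in groups)      # total number of scores
    G = sum(sum(g) for g in groups)      # grand total
    sum_x2 = sum(x * x for g in groups for x in g)

    ss_total = sum_x2 - G**2 / N
    ss_between = sum(sum(g)**2 / len(g) for g in groups) - G**2 / N
    ss_within = ss_total - ss_between

    ms_between = ss_between / (k - 1)    # dfbetween = k - 1
    ms_within = ss_within / (N - k)      # dfwithin = N - k
    return ss_between, ss_within, ms_between, ms_within, ms_between / ms_within
```

With four groups of five scores whose treatment totals are T = 5, 10, 20, 25 (the data used in the worked examples below), this returns SSbetween = 50, SSwithin = 32, and F ≈ 8.33.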
Summary of ANOVA data:
F distribution
In our example, the value for the F-ratio is high
(11.28). Is this value really significant?
Need to compare this value to the overall F
distribution.
Note:
1) F-ratios must be positive.
2) If H0 is true, F is around 1.00.
3) Exact shape of F distribution will depend on
the values for df.
F distribution
[Figure: shape of the F distribution for df = 2, 12.]
F distribution
Let’s take a look at an F distribution table: [Table: rows indexed by denominator degrees of freedom, columns by numerator degrees of freedom (1–6).]
Hypothesis Testing with ANOVA
Step 1: Hypotheses
• H0: all μ equal; H1: at least one μ is different
Step 2: Determine critical value
• F ratios are all positive (only one tail)
• Need: dfB and dfW
Step 3: Calculations
• SSB and SSW
• MSB and MSW
• F
Step 4: Decision and conclusions
• And maybe a source table
Hypothesis Testing with ANOVA
Data for three drugs designed to act as pain
relievers:
[Data table: scores for Placebo, Drug A, Drug B, Drug C; n = 5 per group, treatment totals T = 5, 10, 20, 25, G = 60, ΣX² = 262.]
Step 1: State hypotheses
H0: μ1 = μ2 = μ3 = μ4.
H1: At least one μ is different.
Step 2: Determine the critical region
Set α = .05
Determine df.
dftotal = N-1 = 20-1= 19
dfbetween = k-1 = 4-1 = 3
dfwithin = N-k = 20-4 = 16
For the data given in the example:
df = 3, 16
Step 3: Calculate the F-ratio for the data
1) Obtain SSbetween and SSwithin.
2) Use SS and df values to calculate the two
variances MSbetween and MSwithin.
3) Finally, use the two MS values to compute the F-ratio.
1) Sum of Squares

Total:
SSTotal = ΣX² - G²/N = 262 - 60²/20 = 262 - 180 = 82

Between:
SSBetween = Σ(T²/n) - G²/N
          = (5² + 10² + 20² + 25²)/5 - 60²/20
          = 230 - 180 = 50

Within:
SSWithin = SSTotal - SSBetween = 82 - 50 = 32
2) Mean Squares

Between:  MSBetween = SSBetween/dfBetween = 50/3 = 16.67
Within:   MSWithin = SSWithin/dfWithin = 32/16 = 2.00
3) F-Ratio

F = MSBetween/MSWithin = 16.67/2.00 = 8.33
Step 4: Decision and Conclusion
Fobt exceeds Fcrit → Reject H0
We must reject the null hypothesis that all of the drugs are the same, F(3, 16) = 8.33, p < .05.
Summary Table:

Source     SS    df    MS      F
Between    50     3    16.67   8.33*
Within     32    16     2.00
Total      82    19
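The hand calculation can be checked from the summary quantities alone, since SSbetween needs only T, n, and G. A sketch using the example's values (variable names are mine):

```python
# Summary quantities from the pain-reliever example.
T = [5, 10, 20, 25]   # treatment totals (Placebo, Drug A, B, C)
n, N = 5, 20          # scores per treatment, total scores
G = sum(T)            # grand total = 60
sum_x2 = 262          # sum of squared scores, given in the example

ss_total = sum_x2 - G**2 / N                       # 262 - 180 = 82
ss_between = sum(t**2 / n for t in T) - G**2 / N   # 230 - 180 = 50
ss_within = ss_total - ss_between                  # 32

F = (ss_between / 3) / (ss_within / 16)            # df = 3, 16
print(round(F, 2))  # 8.33
```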
Source Table for Independent-Measures ANOVA

Source    SS                      df     MS                           F
Between   SSB = Σ(T²/n) - G²/N   k-1    MSbetween = SSB/dfbetween    F = MSbetween/MSwithin
Within    SSW = SST - SSB        N-k    MSwithin = SSW/dfwithin
Total     SST = ΣX² - G²/N       N-1
Let’s visualize the concepts of between-treatment and within-treatment variability.
What are the corresponding F-ratios?

F = MSbetween/MSwithin

Experiment A: F = 56/0.667 = 83.96
Experiment B: F = 56/40.33 = 1.39
Randomized Block Designs
The Logic of ANOVA (recap)

F = (treatment effect + chance) / chance

If there is no effect due to treatment: F ≈ 1.00.
If there is a significant effect due to treatment: F > 1.00.
The denominator of the F-ratio is also called the error term (it measures only unsystematic variance).
Two Types of ANOVA
Independent measures design: Groups are
samples of independent measurements (different
people)
Dependent measures design: Groups are samples
of dependent measurements (usually same people
at different times; also matched samples)
“Repeated measures”
With t-tests, we used different formulae depending
on the design…this is also true of ANOVA
The Logic of ANOVA
Independent Measures
Differences between groups could be due to
• Treatment effect
• Individual differences
• Error or chance (tired, hungry, etc)
Differences within groups could be due to
• Individual differences
• Error or chance
F = (treatment effect + individual differences + chance) / (individual differences + chance)
A repeated-measures design removes variability due to individual
differences, and gives us a more powerful test
Repeated Measures
In a repeated-measures design, the same people
are tested in each treatment, so differences
between treatment groups cannot be due to
individual differences
F = (treatment effect + individual differences + chance) / (individual differences + chance)

So, we need to estimate the differences between individuals and remove them from the denominator.
Then we will have a purer measure of the actual
treatment effect (if it exists)
Partitioning of Variance/df

Total variance splits into:

Between-treatments variance:
  1) Treatment effect
  2) Error or chance (excluding individual differences)

Within-treatments variance:
  1) Individual differences
  2) Error or chance

The within-treatments variance splits further into:

Between-subjects variance:
  1) Individual differences

Error variance:
  1) Error or chance (excluding individual differences)
Example: Number of errors on a typing task while coffee is consumed

Person   Baseline   Time 1   Time 2   Time 3   Person Totals
A            3         4        6        7       P = 20
B            0         3        3        6       P = 12
C            2         1        4        5       P = 12
D            0         1        3        4       P = 8
E            0         1        4        3       P = 8
           T = 5     T = 10   T = 20   T = 25
           SS = 8    SS = 8   SS = 6   SS = 10

n = 5, k = 4, N = 20, G = 60, ΣX² = 262

We also compute person totals (P) to get an estimate of individual differences.
Sum of Squares: Stage 1
The first step is identical to independent-measures ANOVA.

Total:    SStotal = ΣX² - G²/N            dftotal = N - 1
Between:  SSbetween = Σ(T²/n) - G²/N      dfbetween = k - 1
Within:   SSwithin = SStotal - SSbetween  dfwithin = N - k
Sum of Squares: Stage 2
In the second stage, we simply remove the individual differences from the denominator of the F-ratio:

SSb/s = Σ(P²/k) - G²/N         dfb/s = n - 1
SSerror = SSwithin - SSb/s     dferror = dfwithin - dfb/s
Mean Squares and F-ratio
Now we just substitute MSerror into the denominator of F:

MSbetween = SSbetween/dfbetween
MSerror = SSerror/dferror

F = MSbetween/MSerror
Source Table for Repeated-Measures ANOVA

Source          SS                           df            MS                           F
Between         SSB = Σ(T²/n) - G²/N        k-1           MSbetween = SSB/dfbetween    F = MSbetween/MSerror
Within          SSW = SST - SSB             N-k
  b/w subjects  SSb/s = Σ(P²/k) - G²/N      n-1
  Error         SSerror = SSwithin - SSb/s  (N-k)-(n-1)   MSerror = SSerror/dferror
Total           SST = ΣX² - G²/N            N-1
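The two-stage analysis in the table can be written as a short function. A minimal sketch (pure Python; the function and variable names are my own):

```python
def repeated_measures_anova(data):
    """F-ratio for a one-way repeated-measures ANOVA.

    data: list of rows, one per subject; each row holds that subject's
    score in each of the k treatments.
    """
    n = len(data)            # number of subjects
    k = len(data[0])         # number of treatments
    N = n * k
    G = sum(sum(row) for row in data)
    sum_x2 = sum(x * x for row in data for x in row)

    # Stage 1: identical to the independent-measures analysis.
    ss_total = sum_x2 - G**2 / N
    col_totals = [sum(row[j] for row in data) for j in range(k)]  # T values
    ss_between = sum(t**2 / n for t in col_totals) - G**2 / N
    ss_within = ss_total - ss_between

    # Stage 2: remove individual differences (person totals P) from the error.
    person_totals = [sum(row) for row in data]
    ss_subjects = sum(p**2 / k for p in person_totals) - G**2 / N
    ss_error = ss_within - ss_subjects

    df_between = k - 1
    df_error = (N - k) - (n - 1)
    return (ss_between / df_between) / (ss_error / df_error)
```

With the coffee-typing data above this returns F = 25.0 exactly; the lecture's 24.88 comes from rounding the two MS values to two decimals before dividing.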
Example: Number of errors on a typing task while coffee is consumed

Person   Baseline   Time 1   Time 2   Time 3   Person Totals
A            3         4        6        7       P = 20
B            0         3        3        6       P = 12
C            2         1        4        5       P = 12
D            0         1        3        4       P = 8
E            0         1        4        3       P = 8
           T = 5     T = 10   T = 20   T = 25
           SS = 8    SS = 8   SS = 6   SS = 10

n = 5, k = 4, N = 20, G = 60, ΣX² = 262
Step 1: State hypotheses
H0: μ1 = μ2 = μ3 = μ4.
H1: At least one μ is different.

Step 2: Determine the critical region
Set α = .05 and determine df:

dftotal = N - 1 = 20 - 1 = 19
dfbetween = k - 1 = 4 - 1 = 3
dfwithin = N - k = 20 - 4 = 16
dfb/s = n - 1 = 5 - 1 = 4
dferror = dfwithin - dfb/s = 16 - 4 = 12

Fcrit(3, 12) = 3.49
Step 3: Calculate the F-ratio for the data
1) Obtain SSbetween and SSerror.
2) Use SS and df values to calculate the two
variances MSbetween and MSerror.
3) Finally, use the two MS values to compute the F-ratio.
1) Sum of Squares, Stage 1:

Total:
SSTotal = ΣX² - G²/N = 262 - 60²/20 = 262 - 180 = 82

Between:
SSBetween = Σ(T²/n) - G²/N
          = (5² + 10² + 20² + 25²)/5 - 60²/20
          = 230 - 180 = 50

Within:
SSWithin = SSTotal - SSBetween = 82 - 50 = 32
1) Sum of Squares, Stage 2:

Between subjects:
SSb/s = Σ(P²/k) - G²/N
      = (20² + 12² + 12² + 8² + 8²)/4 - 60²/20
      = 204 - 180 = 24

Error:
SSerror = SSwithin - SSb/s = 32 - 24 = 8
2) Mean Squares

Between:  MSBetween = SSBetween/dfBetween = 50/3 = 16.67
Error:    MSError = SSError/dfError = 8/12 = 0.67

3) F-Ratio

F = MSBetween/MSError = 16.67/0.67 = 24.88
Note: These are the same data we used in the independent-measures ANOVA on Thursday; by changing from an independent-measures to a repeated-measures design, we've gone from F = 8.33 to F = 24.88.
Step 4: Decision and Conclusion
Fobt exceeds Fcrit → Reject H0
We must reject the null hypothesis that coffee has no effect on errors, F(3, 12) = 24.88, p < .05.
Summary Table:

Source      SS    df    MS      F
Between     50     3    16.67   24.88*
Within      32    16
  b/w Ss    24     4
  Error      8    12     0.67
Total       82    19
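Putting the two analyses side by side shows where the extra power comes from: the same SSbetween is divided by a much smaller error term once the individual differences (SS = 24) are pulled out of the denominator. A sketch using the summary values above:

```python
# Same data, two designs: removing individual differences (SS = 24)
# from the error term shrinks the denominator and boosts F.
ss_between, df_between = 50, 3
ss_within, df_within = 32, 16       # independent-measures error term
ss_subjects, df_subjects = 24, 4    # individual differences (person totals)
ss_error = ss_within - ss_subjects  # 8
df_error = df_within - df_subjects  # 12

f_independent = (ss_between / df_between) / (ss_within / df_within)
f_repeated = (ss_between / df_between) / (ss_error / df_error)
print(round(f_independent, 2), round(f_repeated, 2))  # 8.33 25.0
```

(The slides report 24.88 rather than 25.0 because the MS values are rounded to two decimals before dividing.)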
Advantages of Repeated-Measures
Remember: variance (=“noise”) in the samples
increases the estimated standard error and makes
it harder for a treatment-related effect to be
detected. (Remember how we added up two
sources of variance in the independent-measures
design.)
Repeated-measures design reduces or limits the
variance, by eliminating the individual differences
between samples.
Problems with Repeated Measures
Carryover effect (specifically associated with
repeated-measures design): subject’s score in
second measurement is altered by a lingering
aftereffect from the first measurement.
Examples: testing of two drugs in succession,
motivation effects, etc.
Important: Aftereffect from first treatment
Problems with Repeated Measures
Progressive error: Subject’s score changes over
time due to a consistent (systematic) effect.
Examples: fatigue, practice
Important: effect of time alone
History: changes outside the individual that
may be confounded w/ the treatment
Maturation: changes within the individual that
may be confounded w/ the treatment