Treating Stimuli as a Random Factor
in Social Psychology:
A New and Comprehensive Solution
to a Pervasive but Largely Ignored Problem
Jacob Westfall
University of Colorado Boulder
Charles M. Judd
University of Colorado Boulder
David A. Kenny
University of Connecticut
What to do about replicability?
• Mandatory reporting of all DVs, studies, etc.?
• Journals or journal sections devoted to straight
replication attempts?
• Pre-registration of studies?
• Many of the proposed solutions involve large-scale
institutional changes, restructuring incentives, etc.
• These are good ideas worth discussing, but they are
neither quick nor easy to implement
One way to increase replicability:
Treat stimuli as random
• Failure to account for uncertainty associated with
stimulus sampling (i.e., treating stimuli as fixed
rather than random) leads to biased, overconfident
estimates of effects (Clark, 1973; Coleman, 1964)
• The pervasive failure to model stimuli as a random
factor is probably responsible for many failures to
replicate when future studies use different stimulus
samples
Doing the correct analysis is easy!
• Recently developed statistical methods solve the
statistical problem of stimulus sampling
• These mixed models with crossed random effects
are easy to apply and are already widely available in
major statistical packages (R, SAS, SPSS, Stata, etc.)
Outline of rest of talk
1. The problem
– Illustrative design and typical RM-ANOVA analyses
– Estimated type 1 error rates
2. The solution
– Introducing mixed models with crossed random
effects for participants and stimuli
– Applications of mixed model analyses to actual
datasets
Illustrative Design
• Participants crossed with Stimuli
– Each Participant responds to each Stimulus
• Stimuli nested under Condition
– Each Stimulus always in either Condition A or Condition B
• Participants crossed with Condition
– Participants make responses under both Conditions (the resulting data layout is sketched in code below)
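To make the crossed and nested structure concrete, here is a minimal R sketch of the data layout this design implies (the data frame and variable names are illustrative, not from the talk):

  # Every participant responds to every stimulus (fully crossed);
  # each stimulus belongs to exactly one condition (nested).
  design <- expand.grid(participant = 1:3, stimulus = 1:12)
  design$condition <- ifelse(design$stimulus <= 6, "A", "B")
  head(design)   # one row per participant-by-stimulus response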
Sample of hypothetical dataset (3 participants × 12 stimuli; each cell is one rating):

                   Black stimuli (1–6)      White stimuli (7–12)
  Participant 1:   5  4  6  7  3  8         8  7  9  5  6  5
  Participant 2:   4  4  7  8  4  6         9  6  7  4  5  6
  Participant 3:   5  3  6  7  4  5         7  5  8  3  4  5
Typical repeated measures analyses (RM-ANOVA)
How variable are the stimulus ratings around each of the participant means?
That stimulus-to-stimulus variance is lost when the ratings are aggregated into participant means.
“By-participant analysis”: average each participant's ratings over stimuli within each condition, then examine the condition difference across participants (a code sketch of this analysis follows below).

                   M_Black   M_White   Difference
  Participant 1:     5.5       6.67       1.17
  Participant 2:     5.5       6.17       0.67
  Participant 3:     5.0       5.33       0.33
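A minimal sketch of how this by-participant analysis might be run in R, assuming a long-format data frame dat with columns participant, condition (coded "A"/"B"), and rating y (all names are assumptions):

  # Aggregate over stimuli within each participant and condition,
  # then compare the two condition means per participant (paired t-test).
  pmeans <- aggregate(y ~ participant + condition, data = dat, FUN = mean)
  wide   <- reshape(pmeans, idvar = "participant", timevar = "condition",
                    direction = "wide")
  t.test(wide$y.A, wide$y.B, paired = TRUE)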
Typical repeated measures analyses (RM-ANOVA)
“By-stimulus analysis”: average each stimulus's ratings over participants, then compare the two resulting samples of stimulus means, Sample 1 vs. Sample 2, as independent groups (a code sketch follows below).

  Sample 1 (Black stimulus means, 1–6):   4.00  3.67  6.33  7.33  3.67  6.33
  Sample 2 (White stimulus means, 7–12):  8.00  6.00  8.00  4.00  5.00  5.33
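A corresponding sketch of the by-stimulus analysis, under the same assumed data frame (with a stimulus column added):

  # Aggregate over participants within each stimulus, then compare the
  # stimulus means between conditions with an independent-samples t-test
  # (stimuli are nested in condition, so the two groups share no stimuli).
  smeans <- aggregate(y ~ stimulus + condition, data = dat, FUN = mean)
  t.test(y ~ condition, data = smeans)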
Simulation of type 1 error rates for
typical RM-ANOVA analyses
• Design is the same as previously discussed
• Draw random samples of participants and stimuli
– Variance components = 4, Error variance = 16
• Number of participants ∈ {10, 30, 50, 70, 90}
• Number of stimuli ∈ {10, 30, 50, 70, 90}
• Conducted both by-participant and by-stimulus
analysis on each simulated dataset
• True Condition effect = 0 (a code sketch of one simulation run follows below)
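One way such a simulation run could be written in R is sketched below; this is an illustration under the stated variance components, not the authors' actual simulation code, and only the by-participant test is shown:

  simulate_once <- function(n_part = 30, n_stim = 30) {
    # random effects: all variance components = 4 (sd = 2)
    p_int <- rnorm(n_part, sd = 2)                  # participant intercepts
    p_slp <- rnorm(n_part, sd = 2)                  # participant slopes
    s_int <- rnorm(n_stim, sd = 2)                  # stimulus intercepts
    cond  <- rep(c(-0.5, 0.5), each = n_stim / 2)   # stimuli nested in condition
    dat   <- expand.grid(part = 1:n_part, stim = 1:n_stim)
    dat$c <- cond[dat$stim]
    dat$y <- p_int[dat$part] + s_int[dat$stim] +    # true condition effect = 0
             p_slp[dat$part] * dat$c + rnorm(nrow(dat), sd = 4)  # error variance = 16
    # by-participant analysis: does it falsely reject at alpha = .05?
    m <- tapply(dat$y, list(dat$part, dat$c), mean) # participants x conditions matrix
    t.test(m[, 1], m[, 2], paired = TRUE)$p.value < .05
  }
  mean(replicate(2000, simulate_once()))            # estimated Type 1 error rate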
Type 1 error rate simulation results
• The exact simulated error rates depend on the
variance components, which, although realistic,
were ultimately arbitrary
• The main points to take away here are:
1. The standard analyses will virtually always show
some degree of positive bias
2. In some (entirely realistic) cases, this bias can be
extreme
3. The degree of bias depends in a predictable way on
the design of the experiment (e.g., the sample sizes)
The old solution: Quasi-F statistics
• Although quasi-Fs successfully address the
statistical problem, they suffer from a variety of
limitations
– Require complete orthogonal design (balanced factors)
– No missing data
– No continuous covariates
– A different quasi-F must be derived (often laboriously)
for each new experimental design
– Not widely implemented in major statistical packages
The new solution: Mixed models
• Known variously as:
– Mixed-effects models, multilevel models, random
effect models, hierarchical linear models, etc.
• Most social psychologists familiar with mixed
models for hierarchical random factors
– E.g., students nested in classrooms
• Less well known is that mixed models can also
easily accommodate designs with crossed
random factors
– E.g., participants crossed with stimuli
Decomposing a hypothetical dataset into its components (illustrative values):
• Grand mean = 100; Condition means: MeanA = -5, MeanB = 5
• Participant intercepts: 5.86, 7.09, -1.09, -4.53
• Stimulus intercepts: -2.84, -9.19, -1.16, 18.17
• Participant slopes: 3.02, -9.09, 3.15, -1.38
• Everything else = residual error
The linear mixed-effects model with crossed random effects

  Y_ij = (β0 + P_0i + S_0j) + (β1 + P_1i)·C_j + E_ij

• Fixed effects: the Intercept β0 and the Condition Slope β1
• Random effects: participant intercepts P_0i, stimulus intercepts S_0j, participant slopes P_1i, and residual error E_ij
• 6 parameters
Fitting mixed models is easy: Sample syntax
R:
  library(lme4)
  model <- lmer(y ~ c + (1 | j) + (c | i))

SAS:
  proc mixed covtest;
    class i j;
    model y = c / solution;
    random intercept c / sub=i type=un;
    random intercept / sub=j;
  run;

SPSS:
  MIXED y WITH c
    /FIXED=c
    /PRINT=SOLUTION TESTCOV
    /RANDOM=INTERCEPT c | SUBJECT(i) COVTYPE(UN)
    /RANDOM=INTERCEPT | SUBJECT(j).
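For the R version, one common follow-up (an option not shown in the talk; the data frame name dat is assumed) is to fit the same model through the lmerTest package, which adds Satterthwaite-approximated degrees of freedom and p-values for the fixed effects:

  library(lmerTest)   # wraps lme4::lmer and adds approximate df and p-values
  model <- lmer(y ~ c + (1 | j) + (c | i), data = dat)
  summary(model)      # fixed-effect estimates with Satterthwaite t tests
  anova(model)        # F test for the Condition effect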
Mixed models successfully maintain
the nominal type 1 error rate (α = .05)
Applications to existing datasets
1. Representative simulated dataset (for
comparison)
2. Afrocentric features data (Blair et al., 2002,
2004, 2005)
3. Shooter data (Correll et al., 2002, 2007)
3. Psi / Retroactive priming data (Bem, 2011)
– Forward-priming condition (classic evaluative
priming effect)
– Reverse-priming condition (psi condition)
Comparison of effects between RM-ANOVA and mixed model analyses

  Dataset                              RM-ANOVA (by-participant)       Mixed model                       Stimulus ICC
                                       F ratio   D.F.       p          F ratio   D.F.          p
  Simulated example                    30.48     (1, 29)    <.001      9.11      (1, 38.52)    .005      r = 0.191
  Shooter data                         57.89     (1, 35)    <.001      3.39      (1, 48.1)     .072      r = 0.317
  Afrocentric features data            6.40      (1, 46)    .015       4.33      (1, 51.1)     .043      r = 0.113
  Bem (2011), forward-priming cond.    22.18     (1, 98)    <.001      14.59     (1, 46.91)    .029      Targets: r = 0.349; Primes: r = 0.035
  Bem (2011), reverse-priming cond.    6.60      (1, 98)    .012       2.34      (1, 27.58)    .136      Targets: r = 0.292; Primes: r = 0.0
Conclusion
• Many failures to replicate are probably due to
stimulus sampling and the failure to take that
sampling into account
• Mixed models with crossed random effects allow for
generalization to future studies with different
samples of both stimuli and participants
The end
Further reading:
Judd, C. M., Westfall, J., & Kenny, D. A. (2012).
Treating stimuli as a random factor in social
psychology: A new and comprehensive solution to a
pervasive but largely ignored problem. Journal of
Personality and Social Psychology, 103(1), 54–69.