
Statistical power in experiments in which samples of participants respond to samples of stimuli
Jake Westfall
University of Colorado Boulder
David A. Kenny
University of Connecticut
Charles M. Judd
University of Colorado Boulder
• Studies involving participants responding
to stimuli (hypothetical data matrix):
[Data matrix figure: rows labeled Subject # 1, 2, 3, ...; each cell holds a single-digit response value]
• Just in the domain of implicit prejudice and
stereotyping:
– IAT (Greenwald et al.)
– Affective Priming (Fazio et al.)
– Shooter task (Correll et al.)
– Affect Misattribution Procedure (Payne et al.)
– Go/No-Go task (Nosek et al.)
– Primed Lexical Decision task (Wittenbrink et al.)
– Many non-paradigmatic studies
Hard questions
• “How many stimuli should I use?”
• “How similar or variable should the stimuli
be?”
• “When should I counterbalance the
assignment of stimuli to conditions?”
• “Is it better to have all participants respond
to the same set of stimuli, or should each
participant receive different stimuli?”
• “Should participants make multiple responses
to each stimulus, or should every response by
a participant be to a unique stimulus?”
Power analysis in crossed designs
• Power determined by several parameters:
– 1 effect size (Cohen’s d)
– 2 sample sizes
• p = # of participants
• q = # of stimuli
– Set of Variance Partitioning Coefficients (VPCs)
• VPCs describe what proportion of the random
variation in the data comes from which sources
• Different designs depend on different VPCs
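As a rough illustration of how these parameters combine, the sketch below converts raw variance components into VPCs and approximates power for one design: stimuli nested within two conditions, with every participant responding to every stimulus. The random-intercepts-only model (participant, stimulus, residual error), the function names, the definition of d as the mean difference divided by the total standard deviation, and the normal approximation to the test are all assumptions made for illustration. They are not the formulas from the article or the power app.

```python
from math import sqrt
from statistics import NormalDist


def vpcs(variances):
    """Convert raw variance components into proportions of total random variance."""
    total = sum(variances.values())
    return {name: value / total for name, value in variances.items()}


def approx_power(d, p, q, v, alpha=0.05):
    """Approximate two-sided power for the condition difference.

    Assumed design (hypothetical, for illustration): q stimuli split evenly
    between two conditions, and all p participants respond to all q stimuli.
    v is a dict of VPCs with keys 'stimulus' and 'error'.
    """
    # Variance of the difference between the two condition means, in units
    # of total variance. Each condition has q/2 stimuli, so stimulus
    # intercepts contribute 4*v_s/q; residual error contributes 4*v_e/(p*q).
    # Participant intercepts cancel because every participant appears in
    # both conditions.
    var_diff = 4 * v["stimulus"] / q + 4 * v["error"] / (p * q)
    ncp = d / sqrt(var_diff)
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)
    # Normal approximation: ignores the finite degrees of freedom of the test.
    return z.cdf(ncp - crit) + z.cdf(-ncp - crit)


# Example with made-up variance components.
example_vpcs = vpcs({"participant": 3.0, "stimulus": 1.0, "error": 6.0})
example_power = approx_power(d=0.5, p=20, q=50, v=example_vpcs)
```

Under this simplified model, increasing p only shrinks the error term, while the stimulus term depends on q alone.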
Four common experimental designs
For power = 0.80, need q ≈ 50
For power = 0.80, need p ≈ 20
Maximum attainable power
• In crossed designs, power asymptotes
at a maximum theoretically attainable
value that depends on:
– Effect size
– Number of stimuli
– Stimulus variability
• Under realistic assumptions, maximum
attainable power can be quite low!
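The asymptote can be seen in a simplified model. Assuming stimuli nested within two conditions, random stimulus intercepts with VPC v_s, d defined relative to the total standard deviation, and a normal approximation to the test, the noncentrality parameter approaches d·√q / (2·√v_s) as p grows without bound. The sketch below uses that limit; the function names and the model are illustrative assumptions rather than the article's exact formulas, so its numbers need not match those on the slides.

```python
from math import sqrt
from statistics import NormalDist

_Z = NormalDist()


def max_power(d, q, v_s, alpha=0.05):
    """Limit of two-sided power as the number of participants goes to infinity.

    Assumes the simplified model described above (hypothetical, for illustration).
    """
    if v_s <= 0:
        raise ValueError("v_s must be positive; with v_s = 0 power is unbounded in p")
    ncp = d * sqrt(q) / (2 * sqrt(v_s))
    crit = _Z.inv_cdf(1 - alpha / 2)
    return _Z.cdf(ncp - crit) + _Z.cdf(-ncp - crit)


def min_stimuli(d, v_s, target=0.9, alpha=0.05, q_max=10_000):
    """Smallest even q whose maximum attainable power reaches the target.

    Returns None if no q up to q_max is enough.
    """
    for q in range(2, q_max + 1, 2):
        if max_power(d, q, v_s, alpha) >= target:
            return q
    return None
```

For example, `min_stimuli(0.5, v_s)` can be evaluated for a small and a large stimulus VPC to see how strongly the required number of stimuli depends on stimulus variability.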
To obtain max. power = 0.9…
Pessimist: q=86
Realist: q=20 to 50
Optimist: q=11
Implications of maximum attainable power
• Think hard about your experimental
stimuli before you begin collecting data!
– Once data collection begins, maximum
attainable power is pretty much determined.
• Even the most optimistic assumptions
imply that we should use at least 11
stimuli per between-stimulus condition
– Based on achieving max. power = 0.9 to
detect a medium effect size (d = 0.5)
What about time-consuming stimulus presentation?
• Assume that responses to each stimulus
take about 10 minutes (e.g., film clips).
• Power analysis says we need q=60 to
reach power=0.8 (based on having p=60)
• But then it would take about 10 hours for a
participant to respond to every stimulus!
• The highest feasible number of responses
per participant is, say, 6 (about one hour)
• Are we doomed to have low power? No!
Stimuli-within-Block designs
Standard error reduced by factor of 2.3!
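One way such a design could be built, sketched under assumptions: split the q stimuli into equal blocks and assign each participant to one block, so each participant responds only to that block's stimuli. With the numbers from the previous slide (q = 60, 6 responses per participant, p = 60), this gives 10 blocks of 6 stimuli, 6 participants per block, and about one hour per participant. The function name, the random shuffling, and the round-robin assignment are illustrative choices, not the authors' procedure, and the sketch does not handle the assignment of stimuli to conditions.

```python
import random


def assign_blocks(n_participants, n_stimuli, per_participant, seed=0):
    """Map each participant index to the list of stimulus indices in their block."""
    if n_stimuli % per_participant != 0:
        raise ValueError("n_stimuli must be a multiple of per_participant")
    rng = random.Random(seed)
    stimuli = list(range(n_stimuli))
    rng.shuffle(stimuli)
    # Consecutive chunks of the shuffled list form disjoint blocks.
    blocks = [stimuli[i:i + per_participant]
              for i in range(0, n_stimuli, per_participant)]
    participants = list(range(n_participants))
    rng.shuffle(participants)
    # Deal participants to blocks round-robin so block sizes differ by at most one.
    return {pid: blocks[k % len(blocks)] for k, pid in enumerate(participants)}


design = assign_blocks(n_participants=60, n_stimuli=60, per_participant=6)
minutes_per_participant = 6 * 10  # 6 stimuli at about 10 minutes each
```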
The end
URL for power app: JakeWestfall.org/power/
Article reference:
Westfall, J., Kenny, D. A., & Judd, C. M. (in press). Statistical Power and Optimal Design in Experiments in Which Samples of Participants Respond to Samples of Stimuli. Journal of Experimental Psychology: General.