Analyses of Reaction Times

DATA ANALYSIS
MARKUS BRAUER
Analyses of Reaction Time Data
A. General Issues:
- only read in data that are used in the later analyses; in most cases, the output file contains many
dependent variables that are of no interest
- give the variables names that end in numbers (e.g., rt1, rt2, …, rt60); this procedure
allows the researcher to work with vectors (SPSS) or arrays (SAS) later on when groups of
variables have to be recoded into new variables; the particular meaning of the numbers should
be specified in the codebook
- in SPSS, it is necessary to sort the variables in the datafile (using the SAVE OUTFILE …
/KEEP command) so that variables with the same names are next to each other in the data file
B. Delete participants who are outliers:
- create an average error score for each participant; if all correct responses have one value and all
incorrect responses another value, it is possible to simply average across all responses; if the
correct response is sometimes one button ("Good") and sometimes the other ("Bad"), it is
necessary to first create new response variables in which all correct responses have one value and
all incorrect responses another value
- create an average RT score for each participant by averaging across all RT's
- get univariate statistics for both average scores and plot them (histogram and/or stem-and-leaf)
- have the statistics software list the values of these two variables for all participants
- if one participant is suspect, check in detail what is going on: (a) Was s/he obsessed with accuracy
(which would result in exceptionally slow responses and an error rate close to 0%)? (b) Was
s/he obsessed with speed (which would result in exceptionally fast responses and a very high error
rate)? (c) Is his/her exceptionally short/long average response time due to a small number of
responses, or was there a general tendency for him/her to respond fast/slowly? (d) Is there a
problem with the data file, i.e., is his/her exceptionally short/long average response time due to
many "weird" values (e.g., 0000, 9999) which are clearly failures of the computer program?
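The handout's steps are written for SPSS and SAS; purely as an illustrative sketch, the participant-level screening in section B might look like this in Python with pandas (the data are synthetic and every variable name, e.g. rt1/resp1, is hypothetical):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n_part, n_trials = 40, 10

# Synthetic wide-format file: rt1..rt10 in msec, resp1..resp10 (1 = correct).
data = pd.DataFrame({f"rt{i}": rng.lognormal(6.5, 0.3, n_part)
                     for i in range(1, n_trials + 1)})
for i in range(1, n_trials + 1):
    data[f"resp{i}"] = rng.binomial(1, 0.95, n_part)

rt_cols = [f"rt{i}" for i in range(1, n_trials + 1)]
resp_cols = [f"resp{i}" for i in range(1, n_trials + 1)]

# One average RT score and one average error score per participant.
data["mean_rt"] = data[rt_cols].mean(axis=1)
data["error_rate"] = 1 - data[resp_cols].mean(axis=1)

# Univariate statistics plus a listing of both scores for every participant;
# suspect cases (0% errors and very slow, or many errors and very fast)
# are then inspected individually.
print(data[["mean_rt", "error_rate"]].describe())
suspect = data[(data["error_rate"] == 0) | (data["error_rate"] >= 0.10)]
```

A histogram (e.g. `data["mean_rt"].hist()`) can substitute for the stem-and-leaf plot mentioned above.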
C. Find the appropriate transformation:
- data transformation is necessary because RT data are virtually always positively skewed; given
that a normal distribution is one of ANOVA's assumptions, one can only trust the results of the
inferential statistics if one has removed the positive skew prior to the analyses (removing the
skew also increases statistical power); deletion of individual outlier responses is necessary
because it is quite common that participants either "fall asleep" or press the button box
instead of the button on some trials; the issue of data transformation and the deletion of
individual outlier responses is a tricky one because the two are related: transforming data changes
some outliers into valid observations, and deleting outliers affects the skew; the basic idea is to
choose the appropriate transformation for the observations that are clearly not outliers and then to do
the outlier analysis on the transformed data that include all observations
- create a new data set (e.g., "newdata") where each response latency corresponds to a different
observation (e.g., if 40 participants each produced 60 responses, the new data set has 2400
observations); the new data set should contain the following variables: observation number
("idrl", here: from 1 to 2400), participant number ("id", here: from 1 to 40), response number
("respnumb", here: from 1 to 60), the response ("resp", in its new form, i.e., "correct" or
"incorrect", probably coded 0 and 1), the response latency ("rl", in msec)
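As a non-authoritative sketch, the same restructuring can be done in Python/pandas, assuming a wide file with columns rl1…rl60 and resp1…resp60 (all names hypothetical):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n_part, n_resp = 40, 60

# Synthetic wide file: one row per participant.
wide = pd.DataFrame({"id": np.arange(1, n_part + 1)})
for j in range(1, n_resp + 1):
    wide[f"rl{j}"] = rng.lognormal(6.5, 0.3, n_part).round()
    wide[f"resp{j}"] = rng.binomial(1, 0.95, n_part)

# One observation per response latency: 40 x 60 = 2400 rows.
newdata = pd.wide_to_long(wide, stubnames=["rl", "resp"],
                          i="id", j="respnumb").reset_index()
newdata = newdata.sort_values(["id", "respnumb"], ignore_index=True)
newdata["idrl"] = np.arange(1, len(newdata) + 1)   # observation number
```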
- sort the data based on response latency; in SAS, have the stats software list the smallest 5% and
the largest 10% of the observations (here: the first 120 and the last 240 observations)
- check for response latencies that are clearly due to failures of the computer program (e.g., 0000
and 9999) and change them into missing values; check for other inconsistencies
- do univariate statistics on the response latencies to get the mean, the standard deviation and
observations at the extremes of the distribution
- create a new response latency variable (e.g., "nrl") which corresponds to the old response
latency variable ("rl")
- in the new response latency variable, change the smallest 1% of the observations into missing
values (here: the fastest 24 response latencies); if the cut-off point is below 300 msec change all
observations below 300 msec into missing values
- change also the largest 5% of the observations into missing values (here: the slowest 120
response latencies); if the cut-off point is more than 4 standard deviations above the mean
change all observations higher than 4 standard deviations above the mean into missing values
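Both trimming rules can be combined in a few lines; the following Python/numpy sketch is only an illustration on synthetic latencies:

```python
import numpy as np

rng = np.random.default_rng(2)
rl = rng.lognormal(6.5, 0.3, 2400)          # synthetic latencies in msec

# Lower cut-off: 1st percentile, but never below 300 msec.
low_cut = max(np.percentile(rl, 1), 300.0)
# Upper cut-off: 95th percentile, capped at mean + 4 standard deviations.
high_cut = min(np.percentile(rl, 95), rl.mean() + 4 * rl.std())

# New variable "nrl": out-of-range observations become missing values.
nrl = np.where((rl < low_cut) | (rl > high_cut), np.nan, rl)
print(f"cut-offs: {low_cut:.0f}-{high_cut:.0f} msec, "
      f"{int(np.isnan(nrl).sum())} observations removed")
```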
- create several new variables that correspond to the new response latency variable ("nrl") but that
have been subjected to different transformations; the most common transformations are:
a) nv1 = (ov)^.5 = sqrt(ov)    (square root transformation)
b) nv2 = ln(ov)                (logarithmic transformation, natural logarithm)
c) nv3 = (ov)^-.5 = 1/sqrt(ov) (square root reciprocal transformation)
d) nv4 = (ov)^-1 = 1/ov        (reciprocal transformation)
e) nv5 = (ov)^-2 = 1/(ov)^2    (squared reciprocal transformation)
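To compare the candidate transformations, one can compute the sample skewness of each transformed variable; a hedged Python sketch on synthetic data follows (note that the reciprocal-family transforms reverse the rank order of the raw latencies):

```python
import numpy as np

rng = np.random.default_rng(3)
nrl = rng.lognormal(6.5, 0.3, 2400)   # trimmed latencies in msec (synthetic)

def skewness(x):
    """Sample skewness: mean cubed z-score."""
    z = (x - x.mean()) / x.std()
    return float((z ** 3).mean())

candidates = {
    "raw":               nrl,
    "sqrt (nv1)":        nrl ** 0.5,
    "ln (nv2)":          np.log(nrl),
    "recip. sqrt (nv3)": nrl ** -0.5,
    "recip. (nv4)":      nrl ** -1,
    "sq. recip. (nv5)":  nrl ** -2,
}
for name, v in candidates.items():
    print(f"{name:18s} skew = {skewness(v):+.3f}")
```

For these log-normal toy data the logarithm comes out nearly symmetric; with real RT distributions another member of the family may do better.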
- in order to maintain the rank ordering of observations and for easier handling of the data it is
suggested to linearly rescale the transformations (which does not affect their distributional
properties):
a) nv1 = (ov)^.5                   ( COMPUTE nrlt1 = nrl**.5 . )
b) nv2 = ln(ov)                    ( COMPUTE nrlt2 = LN(nrl) . )
c) nv3 = (((ov)^-.5)*-1000)+60     ( COMPUTE nrlt3 = ((nrl**-.5)*-1000)+60 . )
d) nv4 = (((ov)^-1)*-10000)+40     ( COMPUTE nrlt4 = ((nrl**-1)*-10000)+40 . )
e) nv5 = (((ov)^-2)*-1000000)+20   ( COMPUTE nrlt5 = ((nrl**-2)*-1000000)+20 . )
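The point of the linear rescaling is that the negative multiplier restores the rank order that the reciprocal transforms reversed; a minimal Python check with three hypothetical latencies:

```python
import numpy as np

nrl = np.array([350.0, 600.0, 1200.0])   # three example latencies in msec

# Python equivalents of the SPSS COMPUTE statements for c), d) and e).
nrlt3 = ((nrl ** -0.5) * -1000) + 60
nrlt4 = ((nrl ** -1) * -10000) + 40
nrlt5 = ((nrl ** -2) * -1000000) + 20

# All three are now increasing in nrl: the rank order is preserved.
print(nrlt4)
```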
- do univariate statistics and plot the newly created variables (histogram and/or stem-and-leaf) in
order to find out which of the transformations yields the most normal distribution; decide on a
distribution that has a slight positive skew
D. Delete observations that are outliers:
- create yet another new response latency variable ("newrl") that corresponds to the original
response latency variable ("rl"), apply the chosen transformation, and do the outlier analysis, i.e.,
change observations that are more than 3 standard deviations above or below the mean into
missing values; if the mean is used as the indicator of central tendency within cells (see below),
it is advised to apply a stricter exclusion criterion, e.g., to exclude all observations that are more
than 2 standard deviations above or below the mean
- calculate the cut-off points and retransform them into the millisecond metric so that one can find
out the true value of the cut-off points; check if these values "make sense": the lower cut-off
point should be between 300 and 500 msec for fast tasks (word pronunciation, lexical decision)
and between 400 and 700 msec for slow tasks (adjective evaluation, category inclusion); the
upper limit should be between 1500 and 3000 msec for fast tasks and between 3000 and 6000
msec for slow tasks
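A sketch of step D in Python, under the assumption that ln was the transformation chosen in step C (synthetic data; the inverse of ln is exp):

```python
import numpy as np

rng = np.random.default_rng(4)
rl = rng.lognormal(6.5, 0.3, 2400)   # synthetic raw latencies in msec
newrl = np.log(rl)                   # assumed chosen transformation: ln

m, sd = newrl.mean(), newrl.std()
lo, hi = m - 3 * sd, m + 3 * sd      # use 2 SD if cell means will be used
newrl[(newrl < lo) | (newrl > hi)] = np.nan   # outliers -> missing values

# Retransform the cut-offs into the millisecond metric as a sanity check
# against the plausible ranges given above.
lo_ms, hi_ms = np.exp(lo), np.exp(hi)
print(f"cut-offs correspond to {lo_ms:.0f} and {hi_ms:.0f} msec")
```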
- if there is a between-participants variable of theoretical interest check if the response latencies
were affected by the experimental manipulation; if there is a doubt do a t-test
E. Check the error rates:
- create a new response variable ("newresp") that corresponds to the old response variable
("resp"), and change all observations that have been considered outliers in the previous step into
missing values; then, calculate the overall error rate; error rates of 0% and error rates of 10%
and more are suspect
- if there is a between-participants variable of theoretical interest check if the error rate is more or
less the same in the different experimental conditions; if there is a doubt do a chi-square test
- create the "ultimate" response latency variable ("ultimrl") that corresponds to the previous
response latency variable ("newrl"), and change all observations with incorrect responses into
missing values
- at the end, check how many invalid observations (i.e., missing values) there are in the data set;
calculate the percentage (which should be reported in the article); if 5% or less of the RT's have
been deleted, everything is OK; if 10% or more of the observations have missing values there is
a serious problem; if only very few observations (1% or less) have been changed to missing
values consider adopting a stricter criterion for the exclusion of outliers
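The bookkeeping at the end of step E reduces to one percentage; an illustrative Python sketch on synthetic data (the 4% missing rate is fabricated for the example):

```python
import numpy as np

rng = np.random.default_rng(5)
ultimrl = rng.lognormal(6.5, 0.3, 2400)   # synthetic "ultimate" RT variable
ultimrl[:96] = np.nan                     # pretend 96 trials (4%) were excluded

pct_missing = float(np.isnan(ultimrl).mean() * 100)   # report this in the article
if pct_missing <= 5:
    verdict = "OK"
elif pct_missing >= 10:
    verdict = "serious problem"
else:
    verdict = "borderline"
print(f"{pct_missing:.1f}% of observations excluded -> {verdict}")
```

If the percentage falls at or below about 1%, the guidelines above suggest tightening the exclusion criterion instead.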
F. Calculating the dependent variables:
- go back to the original data set and change all RT's that are due to failures of the computer
program, that are outliers, or that correspond to incorrect responses into missing values (work
with arrays/vectors and conditional expressions!!)
- apply the appropriate transformation
- calculate the median response time for each of the within-participants experimental conditions
(e.g., male primes, positive target adjectives); the median is considerably better than the mean
(which is used by most researchers) because it is less affected by outliers; many of the
checks described above and below can be handled more lightly if the median is used as the
dependent variable; if the researcher decides to use the mean instead of the median, s/he should
devote considerable effort to detecting outliers, especially if there are fewer than 8 observations
per cell
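Once the data are in long format, the cell medians are a one-liner; a Python/pandas sketch with hypothetical condition names:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(6)

# Synthetic long-format data: 10 participants x 2 primes x 2 targets x 2 trials.
df = pd.DataFrame({
    "id": np.repeat(np.arange(1, 11), 8),
    "prime": np.tile(["male", "female"], 40),
    "target": np.tile(["pos", "pos", "neg", "neg"], 20),
    "rt": rng.lognormal(6.5, 0.3, 80),
})

# Median RT per participant and within-participants cell; pandas' median
# skips missing values, matching the treatment of excluded trials.
cell_medians = (df.groupby(["id", "prime", "target"])["rt"]
                  .median()
                  .unstack(["prime", "target"]))
```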
- note that reported means should refer to raw latencies (in the millisecond metric) and not to
transformed variables; it is suggested to do all the analyses on the transformed data and to
transform them back into the millisecond metric when one wants to report the means (note
however, that this is only possible when the dependent variables are raw latencies; when the
dependent variables are derived from facilitation scores – i.e., difference scores – a different
procedure has to be employed, see below)
- when response latencies are used as an individual difference measure (e.g., an implicit prejudice
score), the individual difference score is calculated by subtracting the appropriate within-condition
means from each other; for example, in a study examining implicit sexism of male
participants, there may have been male and female primes (MP and FP) and positive and
negative target adjectives (postarg and negtarg); the individual difference score is calculated as
follows:
sexism = (RT MP_negtarg – RT MP_postarg) – (RT FP_negtarg – RT FP_postarg)
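With the four cell means (or medians) in hand, the score is a difference of differences; a toy Python example with fabricated values for two participants:

```python
import pandas as pd

# Hypothetical per-participant cell means in msec.
cells = pd.DataFrame({
    "MP_negtarg": [620.0, 700.0],
    "MP_postarg": [600.0, 640.0],
    "FP_negtarg": [610.0, 660.0],
    "FP_postarg": [605.0, 650.0],
}, index=[1, 2])   # participant id

# sexism = (MP_neg - MP_pos) - (FP_neg - FP_pos)
sexism = (cells["MP_negtarg"] - cells["MP_postarg"]) \
       - (cells["FP_negtarg"] - cells["FP_postarg"])
print(sexism.tolist())   # participant 1: 20 - 5 = 15; participant 2: 60 - 10 = 50
```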
G. The inclusion of baseline ratings:
- baseline ratings are generally used to calculate facilitation scores; the question is to what extent
the access to a particular target adjective (e.g., "lazy") is facilitated when it is preceded by a
particular prime (e.g., "BLACKS") compared to when it is not primed or preceded by a neutral
prime; note that negative facilitation scores represent inhibition
- note that facilitation scores are formed with untransformed response latencies that no longer
contain outliers; however, the determination of the cut-off points for outliers implies that the
data have been transformed previously; the analyses therefore consist of several steps: create
new data sets, find an appropriate transformation, delete outliers, calculate facilitation scores on
raw latencies, find the appropriate transformation for the facilitation scores and apply it (often
there is no need to transform facilitation scores because they tend to be unskewed); those who
prefer to use the mean (and not the median) of all observations coming from one cell might
consider doing another outlier analysis, this time on the transformed facilitation scores
- when baseline ratings are simply derived from neutral primes (e.g., "XXXXX") the baseline
RT's tend to be quite similar to those of the experimental trials and the outlier analyses are
relatively simple; the baseline RT's should be included in the new data sets ("newdata1" and
"newdata2") and the choice of the appropriate transformation and the deletion of outliers should
be done with all response latencies
- when baseline ratings are substantially faster (or slower) than experimental ratings, as is the case
in Fazio et al.'s (1995) Adjective Evaluation Task, baseline response latencies tend to possess
quite different characteristics than experimental response latencies; this is a problem because the
appropriate cut-off points for outliers may differ; it is suggested to create separate data sets
for baseline response latencies and for experimental response latencies and to do the outlier
analyses for each data set separately
- it is suggested to first calculate the facilitation scores, to then transform them (if necessary), and
finally, to take the median (or the mean) within cells; it is not suggested to first transform the data,
to then take the median within cells, and finally, to subtract the medians from each other to form
facilitation scores, because the medians may be based on different adjectives
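The recommended order (facilitation scores first, medians last) can be sketched in Python/pandas; everything here, from the prime labels to the counts, is hypothetical:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)

# Synthetic outlier-free raw latencies: 3 participants x 10 adjectives,
# each adjective once after an experimental prime and once after baseline.
df = pd.DataFrame({
    "id": np.repeat([1, 2, 3], 20),
    "adjective": np.tile(np.repeat(np.arange(10), 2), 3),
    "prime": np.tile(["BLACKS", "XXXXX"], 30),
    "rt": rng.lognormal(6.4, 0.25, 60).round(),
})

# 1) Facilitation score per participant and adjective, on raw latencies
#    (baseline minus primed; negative values represent inhibition).
wide = df.pivot_table(index=["id", "adjective"], columns="prime", values="rt")
wide["facilitation"] = wide["XXXXX"] - wide["BLACKS"]

# 2) Transform here only if the facilitation scores are skewed (often not).
# 3) Median within cells, i.e., per participant, taken last.
per_participant = wide.groupby("id")["facilitation"].median()
```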