
Permutation Tests
… and other randomization
What do we need for a statistical test?
• 1) A test statistic (e.g. t-statistic, correlation)
that measures departures from the hypothesis
• 2) A direction of the test statistic which is in
favor of the alternative (e.g. reject when large)
• 3) Knowledge of the distribution of the test
statistic when the hypothesis is true
This last item is problematic when assumptions
regarding the distribution are questioned
Permutation tests
Permutation tests do not need much in the way of
distributional assumptions: we only need to know
which shufflings of the data would leave its
distribution unchanged when the hypothesis is true.
Then we can generate a number of random shufflings
of the data and see how extreme the test statistic
tends to be under permutation.
Permutation tests -- procedure
• Compute the test statistic for the original data
T0 = T(original data)
• Generate N random permutations of the data and
compute Ti = T(permutation i) for i = 1,…,N
• Count how many Ti’s are bigger than T0 (if we
reject for large T), so that the empirical p-value
is #(Ti > T0)/N
• … related to bootstrap and jackknife
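The procedure above can be sketched in Python for a two-sample comparison. This is a minimal illustration, not a reference implementation: the function name `permutation_pvalue`, the choice of the difference in group means as T, the default N = 10,000, and the fixed seed are all assumptions made for the example. It is one-sided (reject for large T) and uses the strict count #(Ti > T0)/N from the slide.

```python
import numpy as np

def permutation_pvalue(a, b, n_perm=10_000, seed=0):
    """One-sided permutation test; T = mean(a) - mean(b) (assumed statistic)."""
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    pooled = np.concatenate([a, b])
    n_a = len(a)

    # Step 1: test statistic on the original data.
    t0 = a.mean() - b.mean()

    # Step 2: shuffle the pooled data, re-split into groups, recompute T.
    count = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(pooled)
        t_i = shuffled[:n_a].mean() - shuffled[n_a:].mean()
        # Step 3: count permutations with a larger statistic than T0.
        if t_i > t0:
            count += 1

    return t0, count / n_perm

t0, p = permutation_pvalue([5, 6, 7, 8, 9], [1, 2, 3, 4, 5])
print(f"T0 = {t0:.2f}, empirical p-value = {p:.4f}")
```

Shuffling the pooled values and splitting them again is the same as randomly reassigning the group names to the observations.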
Permutation distributions
• For two-sample t-test – randomly assign group
names (say A and B) to the observations
• ANOVA – same idea, randomly assigning the group
names among many groups
• Correlation, regression (including logistic
regression) – randomly match up x and y
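For the correlation case, "randomly match up x and y" amounts to holding x fixed and shuffling y. The sketch below assumes the Pearson correlation as the test statistic and a one-sided test that rejects for large values; the function name `permutation_corr_pvalue` and its defaults are hypothetical, chosen only for illustration.

```python
import numpy as np

def permutation_corr_pvalue(x, y, n_perm=10_000, seed=0):
    """One-sided permutation test for a positive Pearson correlation."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Correlation for the original (x, y) pairing.
    r0 = np.corrcoef(x, y)[0, 1]

    count = 0
    for _ in range(n_perm):
        # Keep x fixed and shuffle y: a random re-matching of x and y.
        r_i = np.corrcoef(x, rng.permutation(y))[0, 1]
        if r_i > r0:
            count += 1

    return r0, count / n_perm

x = np.arange(10)
y = 2 * x + 1
r0, p = permutation_corr_pvalue(x, y)
print(f"r0 = {r0:.3f}, empirical p-value = {p:.4f}")
```

The two-sample and ANOVA cases follow the same pattern, with group names shuffled in place of y.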