Randomness and hypothesis tests: Some concepts to think about
Goal:
Make a decision about whether or not a population parameter is equal to a particular value.
Data:
The decision is made from data randomly sampled from the population. The data are the clues on
which we base our decision.
Randomness:
The data values are random, so any function of the data (e.g., a test statistic) is itself a random variable.
Random variables are described by their distribution.
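
As a rough illustration of the point above, the sketch below draws several samples from the same population and computes a test statistic for each one. The population values (mean 50, standard deviation 10), the sample size of 25, and the choice of a one-sample z statistic with known sigma are all assumptions made for this example; they are not part of the original notes.

```python
import math
import random
import statistics

def z_statistic(sample, mu0, sigma):
    """One-sample z statistic: (sample mean - mu0) / (sigma / sqrt(n))."""
    n = len(sample)
    return (statistics.fmean(sample) - mu0) / (sigma / math.sqrt(n))

rng = random.Random(1)           # fixed seed so the run is repeatable
mu0, sigma, n = 50.0, 10.0, 25   # hypothetical values chosen for illustration

# Five independent samples from the same population give five different
# values of the test statistic: the statistic is a random variable.
z_values = []
for _ in range(5):
    sample = [rng.gauss(mu0, sigma) for _ in range(n)]
    z_values.append(z_statistic(sample, mu0, sigma))

print([round(z, 2) for z in z_values])
```

Each run of the loop plays the role of repeating the whole study; the spread of the printed values is what the distribution of the test statistic describes.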
Distribution of the test statistic:
Being able to conceptualize the distribution of a test statistic is the core idea behind hypothesis testing.
We have only one solid reference point from which to determine the distribution of the test statistic:
the null hypothesis, Ho. Thus the distribution we use for our p-value calculations is the one the test
statistic would have if Ho were true.
Even if Ho is true, we should still expect an occasional extreme value of the test statistic, and this is
what leads to Type 1 errors (rejecting Ho when it is true). If alpha=0.05, we will commit a Type 1 error
5% of the time when Ho is true. (Think in terms of critical values here.)
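
One way to check the 5% claim is by simulation. The sketch below repeatedly samples from a population in which Ho is true and counts how often a two-sided z test rejects. The population values, sample size, number of simulations, and the use of a z test with known sigma are assumptions made for illustration only.

```python
import math
import random
import statistics

rng = random.Random(2)           # fixed seed so the run is repeatable
mu0, sigma, n = 50.0, 10.0, 25   # hypothetical population and sample size
alpha = 0.05

# Two-sided critical value: reject Ho when |z| exceeds it (about 1.96).
z_crit = statistics.NormalDist().inv_cdf(1 - alpha / 2)

n_sims = 20000
rejections = 0
for _ in range(n_sims):
    # Ho is true by construction: the population mean really is mu0.
    sample = [rng.gauss(mu0, sigma) for _ in range(n)]
    z = (statistics.fmean(sample) - mu0) / (sigma / math.sqrt(n))
    if abs(z) > z_crit:
        rejections += 1          # a Type 1 error

type1_rate = rejections / n_sims
print(f"critical value = {z_crit:.3f}, Type 1 error rate = {type1_rate:.3f}")
```

The observed rejection rate should land close to alpha, since every rejection in this simulation is a rare extreme result that occurred even though Ho is true.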
P-value:
The p-value is the probability, if Ho were true, of getting a test statistic as extreme as or more extreme
than the one we got from the data. Extreme values of the test statistic make us second-guess our
assumption that Ho is true and suggest that perhaps we should reject it. If the alternative hypothesis,
Ha, is true, then the distribution of the test statistic will actually be shifted. The difficulty is that we
don’t know how much it is shifted, and that is why we do our calculations assuming Ho is true: it is the
only non-arbitrary reference point.
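
A p-value calculation of this kind can be sketched as follows, again for a two-sided one-sample z test with known sigma. The function name and all of the numbers (sample mean 54, Ho mean 50, sigma 10, n = 25) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math
import statistics

def two_sided_p_value(sample_mean, mu0, sigma, n):
    """Return (z, p) for a two-sided one-sample z test with known sigma."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    # Probability, under Ho, of a statistic at least this far out in one tail.
    tail = 1 - statistics.NormalDist().cdf(abs(z))
    return z, 2 * tail           # double it to count both tails

z, p = two_sided_p_value(sample_mean=54.0, mu0=50.0, sigma=10.0, n=25)
print(f"z = {z:.2f}, p-value = {p:.4f}")
```

Here z = (54 - 50) / (10 / 5) = 2, and the p-value is about 0.0455: if Ho were true, a statistic at least this extreme would occur in roughly 4.6% of samples.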
Power:
Power is the probability of rejecting Ho when Ha is true. (So, just like the p-value, power is a conditional
probability.) The difficulty is that we do not know the true value of the parameter, so we calculate
power for a range of plausible values of the parameter. Power calculations are done during the
planning stage of an experiment so that we can determine appropriate sample sizes. Keeping Ho (that
is, failing to reject it) is often a result of not having enough power (typically because the sample size is
too small) to detect the difference between the true parameter value and the value put forth by Ho.
Of course, with a very large sample size, even a tiny difference can be statistically significant without
being biologically significant.
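
A planning-stage power calculation might look like the sketch below, which tabulates power over a few candidate true means and sample sizes. It assumes a two-sided one-sample z test with known sigma; the function name and all numeric values are hypothetical.

```python
import math
import statistics

def power_two_sided_z(mu_true, mu0, sigma, n, alpha=0.05):
    """Probability that a two-sided z test rejects Ho when the mean is mu_true."""
    std_normal = statistics.NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # How far the distribution of the test statistic is shifted away from Ho.
    shift = (mu_true - mu0) / (sigma / math.sqrt(n))
    upper = 1 - std_normal.cdf(z_crit - shift)   # statistic above +z_crit
    lower = std_normal.cdf(-z_crit - shift)      # statistic below -z_crit
    return upper + lower

mu0, sigma = 50.0, 10.0                          # hypothetical values
for mu_true in (50.0, 52.0, 55.0):
    for n in (10, 25, 100):
        pw = power_two_sided_z(mu_true, mu0, sigma, n)
        print(f"true mean = {mu_true:5.1f}, n = {n:3d}, power = {pw:.3f}")
```

When the true mean equals the Ho value, the "power" is just alpha. For any other true mean, power rises with the sample size, which is why a small study can easily end up keeping Ho.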