EDF 802
Dr. Jeffrey Oescher

A Pictorial Discussion of Factors Affecting Power

A statistical test of a one-tailed research hypothesis (i.e., the alternative hypothesis) versus the null hypothesis is conducted based on where the mean of a randomly drawn sample from the population in question falls. If the sample mean falls into the critical region, the null hypothesis is rejected and there is statistical support for the research hypothesis. The probability that the sample mean lands in the critical region when the null hypothesis is in fact true (the probability of a Type I error) is called the significance level of the test and is shown as the blue region in Figure 1. The smaller this number, the more statistical support the test gives the research hypothesis. If the true mean has some value μ that differs from the value specified in the null hypothesis, the probability that the sample mean lands in the critical region is called the power of the test. Roughly speaking, this number measures the likelihood that the test will correctly detect a discrepancy between the true mean μ and the comparison value. That is, power represents a correct decision to reject the null hypothesis when it should be rejected.

Power is affected by three things: effect size, alpha level, and sample size.

Figure 1 depicts a situation in which H0: µ = 0, α = .057, and the hypothesized value of the mean is 2.00, which represents a moderate effect size. Given these assumptions, the power of the analysis is 0.70, slightly less than the commonly accepted level of 0.80.

Figure 2 depicts the same test except that the hypothesized value of the mean has been increased to 2.50, thus increasing the effect size. While this does not affect alpha, power increases to 0.94, a value well above the recommended level. Thus, increasing the effect size increases power. This is one of the reasons Kerlinger originally formulated the principle of maximizing experimental variance when designing a study.

Figure 3 reverts to the original analysis except that the alpha level has been decreased from .057 to .018. As can be seen, this decreases power to an unacceptable level of 0.50. In addition, the level of alpha should be set on the basis of the researcher's values related to committing a Type I error; changing alpha once it has been set is considered unethical.

Figure 4 reverts to the original analysis except that the sample size has been increased from 10 to 20. As can be seen, the two distributions are considerably narrower, which decreases the overlap between them and thus increases power. An interesting aspect of this figure is the location of the original test statistic: relative to the distribution hypothesized under the null hypothesis, the test statistic falls well outside the critical region. With respect to the distribution based on a hypothesized value of 2.00, the power of the analysis is acceptable at 0.798. Thus, increasing the sample size, something under the direct control of the researcher, can result in acceptable levels of power without sacrificing alpha or effect size. (A computational sketch of these power calculations follows the figure captions below.)

Figure 1. Power of a One-Sample Test of the Mean¹
Figure 2. The Effect of Increasing Effect Size on Power¹
Figure 3. The Effect of Decreasing Alpha on Power¹
Figure 4. The Effect of Increasing Sample Size on Power¹
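The short Python sketch below illustrates how power values like those discussed above can be computed for a one-sample z-test of the mean. The population standard deviation (set here to 3) and the upper-tailed form of the test are assumptions not stated in this handout, so the printed values only approximate the ones shown in the figures.

from scipy.stats import norm

def power_one_sample_z(mu_alt, mu0=0.0, sigma=3.0, n=10, alpha=0.057):
    # Power of an upper-tailed one-sample z-test of H0: mu = mu0
    # when the true mean is mu_alt; sigma is assumed known.
    se = sigma / n ** 0.5                      # standard error of the sample mean
    crit = mu0 + norm.ppf(1 - alpha) * se      # critical value on the mean scale
    return 1 - norm.cdf((crit - mu_alt) / se)  # P(sample mean > crit | mu = mu_alt)

# The four scenarios walked through in Figures 1-4 (sigma = 3 is an assumption)
print(power_one_sample_z(mu_alt=2.00))                # Figure 1: baseline analysis
print(power_one_sample_z(mu_alt=2.50))                # Figure 2: larger effect size
print(power_one_sample_z(mu_alt=2.00, alpha=0.018))   # Figure 3: smaller alpha
print(power_one_sample_z(mu_alt=2.00, n=20))          # Figure 4: larger sample size

Increasing the effect size or the sample size raises the computed power, while lowering alpha reduces it, mirroring the pattern described above.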
"The Power of a Test Concerning the Mean of a Normal Population" from the Wolfram Demonstrations Project http://demonstrations.wolfram.com/ThePowerOfATestConcerningTheMeanOfANormalPopulation/