Factors Affecting Power

EDF 802
Dr. Jeffrey Oescher
A Pictorial Discussion of Factors Affecting Power
A statistical test of a one-tailed research hypothesis (i.e., the alternative hypothesis) versus the
null hypothesis is conducted based on where the mean of a randomly drawn sample from the population
in question falls. If the sample mean falls into the critical region, the null hypothesis is rejected and there
is statistical support for the research hypothesis. The probability that the sample mean lands in the critical
region when the null hypothesis is in fact true - the probability of a Type I error - is called the significance
level of the test and is the blue region in Figure 1. The smaller this number, the more statistical
support the test gives the research hypothesis.
If the true mean has some value μ that is different from that specified in the null hypothesis, the
probability that the sample mean lands in the critical region is called the power of the test. Roughly
speaking, this number measures the likelihood that the test will correctly detect a discrepancy between
the true mean μ and the comparison value. That is, power represents a correct decision to reject the null
hypothesis when it should be rejected.
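The definition above can be written in closed form for one common case. The sketch below assumes an upper-tailed test on the mean of a normal population whose standard deviation σ is known; the handout does not state σ, and the symbols μ0 (the null value), n (the sample size), Φ (the standard normal distribution function), and z_{1-α} (its 1 - α quantile) are notation introduced here for illustration.

```latex
\text{Reject } H_0 \text{ when } \bar{X} > c,
\qquad c = \mu_0 + z_{1-\alpha}\,\frac{\sigma}{\sqrt{n}}

\text{Power}(\mu) = P\left(\bar{X} > c \mid \mu\right)
= 1 - \Phi\!\left(z_{1-\alpha} - \frac{(\mu - \mu_0)\sqrt{n}}{\sigma}\right)
```

Under these assumptions, setting μ = μ0 reduces the expression to α, which is the probability of a Type I error. Power increases as the difference μ - μ0 grows, as α grows, or as n grows.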
Power is affected by three things: effect size, alpha level, and sample size. Figure 1 depicts a
situation where H0: µ = 0, α = .057, and the hypothesized value of the mean is 2.00, which represents a
moderate effect size. The power of this analysis given these assumptions is 0.70, slightly less than the
commonly accepted level of 0.80. Figure 2 depicts the same test with the exception that the hypothesized
value of the mean has been increased to 2.50, thus increasing the effect size. While this does not affect
alpha, power was increased to 0.94, a value well above the recommended level. Thus, increasing effect
size will lead to an increase in the level of power. This is one of the reasons why Kerlinger originally
created the principle of maximizing experimental variance when designing a study. Figure 3 reverts
to the original analysis with the exception of decreasing the alpha level from .057 to .018. As can be seen,
this results in a decrease in power to an unacceptable level of 0.50. In addition, the level of alpha should
be set on the basis of the researcher’s values related to committing a Type I error. Moving alpha once it
has been set is considered unethical. Figure 4 reverts to the original analysis with the exception of
increasing the sample size of the group from 10 to 20. As can be seen, the two distributions are
considerably narrower. This effectively decreases the overlap between the distributions and thus
increases power. An interesting aspect of this figure is the location of the original test statistic. In terms of
the distribution hypothesized under the assumption of the null hypothesis, the test statistic falls well
outside the critical region. With respect to the distribution based on a hypothesized value of 2.00, the
power associated with the analysis is acceptable at 0.798. Thus, increasing the sample size – something
under the direct control of the researcher – can result in acceptable levels of power without sacrificing
alpha or effect size.
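The direction of each effect described above can be checked numerically. The following is a minimal sketch, not the method used to produce the figures: it assumes an upper-tailed one-sample z test on a normal population with a known standard deviation. The value SIGMA = 3.0 and the function name power_upper_tailed_z are assumptions introduced here. The handout does not state the standard deviation; 3.0 was chosen only because it gives a baseline power close to the 0.70 reported for Figure 1. The other printed values depend on that assumption and need not match the values read from the figures.

```python
from math import sqrt
from statistics import NormalDist


def power_upper_tailed_z(mu_true, mu_null, sigma, n, alpha):
    """Power of an upper-tailed one-sample z test of H0: mu = mu_null.

    Assumes a normal population with known standard deviation sigma.
    """
    std_normal = NormalDist()
    # Critical value on the z scale for the chosen significance level.
    z_crit = std_normal.inv_cdf(1 - alpha)
    # Distance between the true and null means, in standard-error units.
    shift = (mu_true - mu_null) * sqrt(n) / sigma
    # Probability that the sample mean lands in the critical region.
    return 1 - std_normal.cdf(z_crit - shift)


# ASSUMPTION: the handout does not give the population standard deviation.
SIGMA = 3.0

# Settings taken from the text; each scenario changes one factor at a time.
scenarios = {
    "baseline": dict(mu_true=2.0, n=10, alpha=0.057),
    "larger effect size": dict(mu_true=2.5, n=10, alpha=0.057),
    "smaller alpha": dict(mu_true=2.0, n=10, alpha=0.018),
    "larger sample": dict(mu_true=2.0, n=20, alpha=0.057),
}

powers = {
    name: power_upper_tailed_z(mu_null=0.0, sigma=SIGMA, **settings)
    for name, settings in scenarios.items()
}

for name, value in powers.items():
    print(f"{name:>20}: power = {value:.3f}")
```

Changing SIGMA rescales every scenario, but the ordering stays the same: a larger effect size or a larger sample raises power relative to the baseline, and a smaller alpha lowers it.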
Figure 1
Power of a One Sample Test of the Mean [1]
Figure 2
The Effect of Increasing Effect Size on Power [1]
Figure 3
The Effect of Decreasing Alpha on Power [1]
Figure 4
The Effect of Increasing Sample Size on Power [1]
[1] All figures were copied from Boucher, C. (20112). "The Power of a Test Concerning the Mean of a
Normal Population," from the Wolfram Demonstrations Project,
http://demonstrations.wolfram.com/ThePowerOfATestConcerningTheMeanOfANormalPopulation/