Psych 5500/6500
t Test for Two Independent Groups:
Power & Beta
Fall, 2008
Power is the probability that you will be able to reject
H0 when H0 is actually false:
p(reject H0|H0 false)
Beta is the probability that you will not be able to
reject H0 when H0 is actually false (i.e. beta is the
probability of making a Type 2 error):
p(not reject H0|H0 false)
p(reject H0|H0 false) + p(not reject H0|H0 false) = 1.00, so power = 1 - beta
Power is the probability you will decide to reject
H0 when you should (i.e. when H0 is false).
In terms of the t test for independent groups,
power is the probability you will conclude that
the populations have different means when in
fact they do.
If we have no serious confounding variables,
then power is the probability that you will
conclude that the independent variable had an
effect when it actually did.
Importance of Power
Knowing the power of a research design is important
for a couple of reasons:
As a researcher you hope to reject H0 and thus
conclude that your independent variable had an effect. If
your experiment has low power then the chances
of accomplishing that even if the I.V. really did
have an effect are low, and there is little reason
to even run the experiment. You may also be
required to estimate the power of your
experiment to obtain a grant to fund it.
Importance (cont.)
Knowing the power of your experiment
can help you to interpret what it signifies if
you were unable to reject H0. If you have
a powerful experiment and fail to reject
H0 that probably signifies that H0 is true.
Power and Interpreting
Failure to Reject H0
If you fail to reject H0 you are essentially saying that
you failed to find a difference between the
population means. As with any failure to find
something, an important consideration is just how
hard did you look? If you do a cursory look for
something (i.e. a ‘low power’ search) and fail to
find it, then it could very well be that it was there
but you just didn’t find it. If you do a very
thorough job of looking for something (i.e. a ‘high
power’ search), then failure to find it probably
means it wasn’t there.
Determining Power
Unlike alpha, which is set when you select a
significance level, the levels of power and
beta cannot be directly set by the
experimenter. There are, however, things
you can do to increase the power of an
experiment, and if you have enough
information you can estimate the resulting
power of the experiment.
Calculating Power
For the t test for independent means, to
calculate the power of an experiment you
need to know: 1) the variances of the
populations; 2) the N of the groups, and 3)
the actual difference between the two
population means. As you rarely know the
values of ‘1’ or ‘3’, you need to guess, usually
based upon previous, similar experiments.
Calculating power is thus almost always a
matter of estimating rather than knowing.
Sampling Distribution
Assuming H0 is True
From the example we used in a previous lecture (the effect of a drug
on number of wrong turns made in a maze):
H0: μ1 = μ2, or equivalently: μ1 - μ2 = 0. In the sampling
distribution, then, we expect $\mu_{\bar{Y}_1 - \bar{Y}_2} = 0$.
(The figure is repeated here again)
This is the distribution we use to make our decision about whether
or not to reject H0. Now let us consider the power of the experiment
if the actual, true difference is μ1 - μ2 = 3; this will be our HA.
H0: μ1 - μ2 = 0
HA: μ1 - μ2 = 3
The estimate of power is based upon a
specific alternative hypothesis, in this
case that μ1 - μ2 = 3.
Sampling Distribution Assuming
HA is True
The sampling distribution assuming H0 is true is how we make our decision
(we reject H0 if the difference between the sample means falls beyond
±2.33). But if HA is true then the curve on the right actually
reflects reality, and the shaded area represents the probability that we
will obtain a result that allows us to reject H0. In other words, the
shaded area is the power of the experiment: power ≈ .70.
The shaded area of the sampling distribution assuming HA is true is
the power of the experiment. To compute it exactly (I ‘eyeballed’
that power was about .70) we need to find what proportion
of the ‘S.D. assuming HA is true’ falls above 2.33. To do that we need
to find the standard score (t value) for 2.33 on the HA curve (we use the
standard error estimate of 1.03 calculated from our samples):
$$t = \frac{2.33 - 3}{1.03} = -0.65$$
The ‘t distribution’ tool I have provided tells us that the proportion
of the t distribution that falls above -0.65 for df=9 is 0.734,
which is the power of this experiment.
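For readers without the ‘t distribution’ tool, the calculation above can be reproduced roughly as follows. This is a minimal sketch, not the course tool: the helper names (t_pdf, t_cdf, power_upper_tail) are made up for illustration, and the t distribution's area is found by simple numerical integration using only Python's standard library. Like the slides, it counts only the upper rejection region and uses the ordinary t distribution.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x), via Simpson's rule on [0, |x|] plus symmetry about zero."""
    a = abs(x)
    if a == 0:
        return 0.5
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area

def power_upper_tail(critical_diff, true_diff, std_error, df):
    """Proportion of the 'HA is true' curve that falls above critical_diff."""
    t = (critical_diff - true_diff) / std_error  # standard score on the HA curve
    return 1.0 - t_cdf(t, df)

# Values from the slides: cutoff 2.33, true difference 3, standard error 1.03, df = 9.
power = power_upper_tail(2.33, 3.0, 1.03, 9)
print(f"estimated power: {power:.3f}")
```

The printed value should land close to the 0.734 reported by the tool.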
Increasing Power
There are four ways* to increase the power of an
experiment:
Increase the effect of the I.V.
Increase the size of the samples.
Decrease the variance of the populations.
Perform a one-tail test.
We will take a look at each in turn. It is important to
note that these steps do not manipulate the data
in an unfair way; they only increase the chances
of rejecting H0 when the independent variable
really did have an effect!
*Note that power is also influenced by whether or not
the assumptions underlying the test are met.
Increasing Effect of the I.V.
If you want to demonstrate that the
independent variable had an effect, it will be
easier to do so if you make it have a larger
effect. For example, suppose we increase the
dosage of the drug so that the difference in
mean maze running ability between the drug
group and the no-drug group becomes greater.
Instead of HA: μ1 - μ2 = 3 we move to HA: μ1 - μ2 = 5.
Original example: difference between means = 3
I.V. stronger, difference between means = 5, note more power...
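To put rough numbers on the two figures, the sketch below repeats the earlier power calculation for a true difference of 3 and of 5. It assumes the cutoff (2.33), standard error (1.03), and df (9) stay the same as in the original example; the helper names are illustrative only.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x), via Simpson's rule on [0, |x|] plus symmetry about zero."""
    a = abs(x)
    if a == 0:
        return 0.5
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area

def power_upper_tail(critical_diff, true_diff, std_error, df):
    """Proportion of the 'HA is true' curve that falls above critical_diff."""
    return 1.0 - t_cdf((critical_diff - true_diff) / std_error, df)

# Same cutoff, standard error, and df as the original example (assumed unchanged).
power_by_diff = {d: power_upper_tail(2.33, d, 1.03, 9) for d in (3.0, 5.0)}
for diff, pw in power_by_diff.items():
    print(f"true difference = {diff}: power is about {pw:.2f}")
```

Moving the HA curve further to the right puts more of it beyond the cutoff, which is the "more power" shown in the second figure.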
Two important points here:
If your I.V. really doesn’t have an effect then
trying to increase its strength won’t work. The
curve according to H0 is reality in this case, and
so the probability of rejecting H0 remains at .05.
Remember, attempts to increase the power of an
experiment only work when they should (i.e.
when H0 really is false).
Be careful not to fall into the trap of increasing
the strength of the I.V. to the point where it is no
longer applicable to real life just to try to get a
statistically significant result.
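The first point above can be checked with a small simulation. The sketch below is an illustration under assumed values, none of which come from the slides: 10 subjects per group, normally distributed scores with σ = 1, and a two-tailed test at alpha = .05 (critical t of 2.101 for df = 18). It runs many simulated experiments and counts how often H0 is rejected.

```python
import math
import random

def pooled_t(group1, group2):
    """t statistic for two independent groups, using the pooled variance."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = sum(group1) / n1, sum(group2) / n2
    ss1 = sum((y - m1) ** 2 for y in group1)
    ss2 = sum((y - m2) ** 2 for y in group2)
    pooled_var = (ss1 + ss2) / (n1 + n2 - 2)
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / std_error

def rejection_rate(true_diff, n=10, sigma=1.0, trials=20000, t_crit=2.101, seed=0):
    """Proportion of simulated experiments in which H0 is rejected."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rejections = 0
    for _ in range(trials):
        g1 = [rng.gauss(true_diff, sigma) for _ in range(n)]
        g2 = [rng.gauss(0.0, sigma) for _ in range(n)]
        if abs(pooled_t(g1, g2)) > t_crit:
            rejections += 1
    return rejections / trials

rate_h0_true = rejection_rate(true_diff=0.0)   # I.V. has no effect
rate_h0_false = rejection_rate(true_diff=1.0)  # I.V. shifts the mean by one sigma
print(f"H0 true:  rejected in {rate_h0_true:.3f} of experiments")
print(f"H0 false: rejected in {rate_h0_false:.3f} of experiments")
```

When the true difference is zero the rejection rate should hover near alpha; only when H0 is actually false does the rate (the power) climb above it.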
Increasing N
Decreasing σ²
Increasing the size of the samples and decreasing
the variance of the populations both influence
power by decreasing the standard error of the
sampling distribution. They both make the means
of the two groups more likely to be representative
of the means of their populations, and thus any
differences in the group means are more likely to
represent real differences in the population means.
Again, if H0 is true then ‘the sampling distribution
assuming H0 is true’ reflects reality, and these two
steps do nothing (other than making it more likely
that $\bar{Y}_1 - \bar{Y}_2$ will be close to zero).
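One way to see why both steps shrink the standard error, sketched under the assumption that the two populations share a common variance σ² (the symbols n1 and n2 for the group sizes are notation introduced here, not taken from the slides):

$$\sigma_{\bar{Y}_1 - \bar{Y}_2} = \sqrt{\sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

A larger n1 or n2 makes the term in parentheses smaller, and a smaller σ² makes the whole product smaller; either way the sampling distribution of the difference between the sample means becomes narrower.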
Original Example
Here the standard error has been decreased by increasing N and/or
by decreasing the variance of the original populations. Note more power.
On Increasing N
Increasing N helps not only by decreasing the
standard error of the sampling distribution, but
also by decreasing the critical t value (moving the
rejection regions closer to the mean of the
sampling distribution assuming H0 is true).
The strategy of increasing N to increase power
can be abused. With a huge N even the tiniest
effect can be statistically significant, even when
the effect is too small to be of theoretical or
social significance.
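Both points about N can be illustrated with a rough calculation. The sketch below uses hypothetical values that are not from the slides: a tiny true difference of 0.2, a common population σ of 1, and equal group sizes. It follows the same approach as the worked example earlier (upper rejection region only, ordinary t distribution); a more exact power analysis would use the noncentral t distribution. Helper names are illustrative.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x), via Simpson's rule on [0, |x|] plus symmetry about zero."""
    a = abs(x)
    if a == 0:
        return 0.5
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area

def t_critical(df, alpha=0.05):
    """Two-tailed critical t, found by bisection on the CDF."""
    lo, hi = 0.0, 50.0
    target = 1.0 - alpha / 2
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def approx_power(true_diff, sigma, n_per_group, alpha=0.05):
    """Return (standard error, critical t, approximate power)."""
    df = 2 * n_per_group - 2
    std_error = sigma * math.sqrt(2.0 / n_per_group)
    t_crit = t_critical(df, alpha)
    t_on_ha = (t_crit * std_error - true_diff) / std_error
    return std_error, t_crit, 1.0 - t_cdf(t_on_ha, df)

# Hypothetical tiny effect: true difference 0.2 with sigma = 1.
results = {n: approx_power(0.2, 1.0, n) for n in (10, 100, 1000, 10000)}
for n, (se, tc, pw) in results.items():
    print(f"n per group = {n:>5}: SE = {se:.4f}, t critical = {tc:.3f}, power = {pw:.3f}")
```

As n grows, both the standard error and the critical t shrink, and the power to detect even this very small difference approaches 1. That is the sense in which a huge N can make a trivial effect statistically significant.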
On Decreasing σ²
To decrease the variance of the population you can:
Choose more homogeneous populations from
which to sample (which will narrow the
populations to which you can generalize the
results).
Select an experimental design that has this built
into it. One example is the ‘repeated measures
design’ (it is analyzed with the ‘t test for
correlated groups’ which we will cover next). A
more elegant example is the ‘model comparison
approach’ which we will cover next semester.
One-Tailed Tests
One-tailed tests are more powerful than two-tailed
tests if (and only if) the results do
indeed fall in the direction predicted by HA.
Of course, you need appropriate justification to
make the test one-tailed (we covered that earlier).
Original Example
In the one-tailed test the line of the rejection region shifts toward
the mean of the ‘S.D. assuming H0 is true’, increasing power.
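A rough comparison for the running example (standard error 1.03, df = 9, true difference 3) is sketched below. The critical t values are the standard table values for df = 9 at alpha = .05 (2.262 two-tailed, 1.833 one-tailed); the helper names are illustrative, and as before only the upper rejection region is counted. The last line shows what happens when the true difference lies in the direction opposite to the one predicted.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x), via Simpson's rule on [0, |x|] plus symmetry about zero."""
    a = abs(x)
    if a == 0:
        return 0.5
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area

def power_upper_tail(t_crit, true_diff, std_error, df):
    """Power when H0 is rejected for differences above t_crit * std_error."""
    critical_diff = t_crit * std_error
    return 1.0 - t_cdf((critical_diff - true_diff) / std_error, df)

STD_ERROR, DF = 1.03, 9            # from the slides' example
T_TWO_TAILED, T_ONE_TAILED = 2.262, 1.833  # table values, df = 9, alpha = .05

power_two = power_upper_tail(T_TWO_TAILED, 3.0, STD_ERROR, DF)
power_one = power_upper_tail(T_ONE_TAILED, 3.0, STD_ERROR, DF)
power_wrong_way = power_upper_tail(T_ONE_TAILED, -3.0, STD_ERROR, DF)

print(f"two-tailed power:             {power_two:.3f}")
print(f"one-tailed power:             {power_one:.3f}")
print(f"one-tailed, wrong direction:  {power_wrong_way:.4f}")
```

The one-tailed cutoff sits closer to zero, so more of the HA curve lies beyond it; but if the effect goes the other way, a one-tailed test has essentially no chance of detecting it.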