Psych 5500/6500
t Test for Two Independent Groups: Power
Fall, 2008

Power & Beta
Power is the probability that you will be able to reject H0 when H0 is actually false: p(reject H0|H0 false)
Beta is the probability that you will not be able to reject H0 when H0 is actually false (i.e. beta is the probability of making a Type 2 error): p(not reject H0|H0 false)
p(reject H0|H0 false) + p(not reject H0|H0 false) = 1.00

Power
1. Power is the probability you will decide to reject H0 when you should (i.e. when H0 is false).
2. In terms of the t test for independent groups, power is the probability you will conclude that the populations have different means when in fact they do.
3. If we have no serious confounding variables, then power is the probability that you will conclude that the independent variable had an effect when it actually did.

Importance of Power
Knowing the power of a research design is important for a couple of reasons:
1. As a researcher you hope to reject H0 and thus show that your independent variable had an effect. If your experiment has low power, then the chances of accomplishing that are low even if the I.V. really did have an effect, and there is little reason to even run the experiment. You may also be required to estimate the power of your experiment to obtain a grant to fund it.
2. Knowing the power of your experiment can help you to interpret what it signifies if you were unable to reject H0. If you have a powerful experiment and fail to reject H0, that probably signifies that H0 is true.

Power and Interpreting Failure to Reject H0
If you fail to reject H0 you are essentially saying that you failed to find a difference between the population means. As with any failure to find something, an important consideration is just how hard you looked.
If you do a cursory look for something (i.e. a ‘low power’ search) and fail to find it, then it could very well be that it was there but you just didn’t find it.
If you do a very thorough job of looking for something (i.e. a ‘high power’ search), then failure to find it probably means it wasn’t there.

Determining Power
Unlike alpha, which is set when you select a significance level, the levels of power and beta cannot be directly set by the experimenter. There are, however, things you can do to increase the power of an experiment, and if you have enough information you can estimate the resulting power of the experiment.

Calculating Power
For the t test for independent means, to calculate the power of an experiment you need to know:
1) the variances of the populations;
2) the N of the groups; and
3) the actual difference between the two population means.
As you rarely know the values of ‘1’ or ‘3’ you need to guess, usually based upon previous, similar experiments. Calculating power is thus almost always a matter of estimating rather than knowing.

Sampling Distribution Assuming H0 is True
From the example we used in a previous lecture (the effect of a drug on number of wrong turns made in a maze):
H0: μ1 = μ2, or equivalently: μ1 - μ2 = 0.
In the sampling distribution, then, we expect $\mu_{\bar{Y}_1 - \bar{Y}_2} = 0$.
[Figure: sampling distribution assuming H0 is true, repeated from the previous lecture.]
This is the distribution we use to make our decision about whether or not to reject H0.
Now let us consider the power of the experiment if the actual, true difference is μ1 - μ2 = 3; this will be our HA.
Hypotheses
H0: μ1 - μ2 = 0
HA: μ1 - μ2 = 3
The estimate of power is based upon a specific alternative hypothesis, in this case that μ1 - μ2 = 3.

Sampling Distribution Assuming HA is True
The sampling distribution assuming H0 is true is how we make our decision (we reject H0 if the difference between the sample means falls beyond ±2.33). But if HA is true then the curve on the right actually reflects reality, and the shaded area represents the probability that we will obtain a result that allows us to reject H0. In other words, the shaded area is the power of the experiment: power ≈ .70.

The shaded area of the sampling distribution assuming HA is true is the power of the experiment. I ‘eyeballed’ that power was about .70; to compute it exactly we need to find out what proportion of the ‘S.D. assuming HA is true’ falls above 2.33. To do that we need to find the standard score (t value) for 2.33 on the HA curve (we use the standard error estimate of 1.03 calculated from our samples):

$t = \frac{2.33 - 3}{1.03} = -0.65$

The ‘t distribution’ tool I have provided tells us that the proportion of the t distribution that falls above -0.65 for df = 9 is 0.734, which is the power of this experiment.

Increasing Power
There are four ways* to increase the power of an experiment:
1. Increase the effect of the I.V.
2. Increase the size of the samples.
3. Decrease the variance of the populations.
4. Perform a one-tail test.
We will take a look at each in turn. It is important to note that these steps do not manipulate the data in an unfair way; they only increase the chances of rejecting H0 when the independent variable really did have an effect!
*Note that power is also influenced by whether or not the assumptions underlying the test are met.

Increasing Effect of the I.V.
If you want to demonstrate that the independent variable had an effect, it will be easier to do so if you make it have a larger effect.
For example, if we increased the dosage of the drug so that the difference in mean maze running ability becomes greater between the drug group and the no-drug group, then instead of HA: μ1 - μ2 = 3 we move to HA: μ1 - μ2 = 5.
[Figure: original example, difference between means = 3.]
[Figure: I.V. stronger, difference between means = 5; note more power.]

Two important points here:
1. If your I.V. really doesn’t have an effect then trying to increase its strength won’t work; the curve according to H0 is reality in this case, and so the probability of rejecting H0 remains at .05. Remember, attempts to increase the power of an experiment only work when they should (i.e. when H0 really is false).
2. Be careful not to fall into the trap of increasing the strength of the I.V. to the point where it is no longer applicable to real life just to try to get a statistically significant result.

Increasing N, Decreasing σ²
Increasing the size of the samples and decreasing the variance of the populations both influence power by decreasing the standard error of the sampling distribution. They both make the means of the two groups more likely to be representative of the means of their populations, and thus any differences in the group means are more likely to represent real differences in the population means.
Again, if H0 is true then ‘the sampling distribution assuming H0 is true’ reflects reality, and these two steps do nothing (other than making it more likely that $\bar{Y}_1 - \bar{Y}_2$ will be close to zero).
[Figure: original example.]
[Figure: here the standard error has been decreased by increasing N and/or by decreasing the variance of the original populations; note more power.]

On Increasing N
1. Increasing N helps not only by decreasing the standard error of the sampling distribution, but also by decreasing the critical value of t (moving the rejection regions closer to the mean of the sampling distribution assuming H0 is true).
2. The strategy of increasing N to increase power can be abused.
With a huge N even the tiniest effect can be statistically significant, even when the effect is too small to be of theoretical or social significance.

On Decreasing σ²
To decrease the variance of the population you can either:
1. Choose more homogeneous populations from which to sample (which will narrow the populations to which you can generalize the results).
2. Select an experimental design that has this built into it. One example is the ‘repeated measures design’ (it is analyzed with the ‘t test for correlated groups’ which we will cover next). A more elegant example is the ‘model comparison approach’ which we will cover next semester.

One-Tailed Tests
One-tailed tests are more powerful than two-tailed tests if (and only if) the results do indeed fall in the direction predicted by HA. Of course, you need appropriate justification to make the test one-tailed (we covered that earlier).
[Figure: original example.]
In the one-tailed test the line of the rejection region shifts toward the mean of the ‘S.D. assuming H0 is true’, increasing power.
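
Checking the Power Estimates Numerically
The maze example can be rechecked with a few lines of Python. This is only a minimal sketch, not the ‘t distribution’ tool used in the lecture. It follows the same approach as the worked example: find the t value of the rejection line on the ‘HA is true’ curve and take the proportion of the t distribution above it, considering only the upper rejection region. (An exact calculation would use the noncentral t distribution, so treat these as approximations.) The function names, the critical values 2.262 and 1.833 (standard table values for df = 9), and the smaller standard error of 0.70 are assumptions introduced for illustration.

```python
import math

def t_pdf(t, df):
    """Density of the central t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T > t), using Simpson's rule on the density between 0 and |t|."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # area under the curve between 0 and |t|
    return 0.5 - area if t >= 0 else 0.5 + area

def estimate_power(true_diff, std_error, df, t_critical):
    """Proportion of the 'HA is true' curve above the upper rejection line."""
    cutoff = t_critical * std_error              # rejection line in raw units
    t_on_ha = (cutoff - true_diff) / std_error   # its t value on the HA curve
    return t_upper_tail(t_on_ha, df)

T_TWO_TAILED = 2.262  # table value: alpha = .05, two-tailed, df = 9
T_ONE_TAILED = 1.833  # table value: alpha = .05, one-tailed, df = 9

# Original example: true difference 3, standard error 1.03 (cutoff is about 2.33).
original = estimate_power(3, 1.03, 9, T_TWO_TAILED)
# Stronger I.V.: true difference 5.
stronger_iv = estimate_power(5, 1.03, 9, T_TWO_TAILED)
# Smaller population variance: hypothetical standard error of 0.70, same df.
smaller_se = estimate_power(3, 0.70, 9, T_TWO_TAILED)
# One-tailed test in the direction predicted by HA.
one_tailed = estimate_power(3, 1.03, 9, T_ONE_TAILED)

for label, value in [("original", original), ("stronger I.V.", stronger_iv),
                     ("smaller std. error", smaller_se), ("one-tailed", one_tailed)]:
    print(f"{label:20s} power = {value:.3f}")
```

The first line of output should land close to the .734 from the worked example, and each of the other three should be larger. Increasing N is not shown separately because it changes df and the critical value of t as well as the standard error.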