Type I and Type II errors and Power

Statistics 215
Case Study: After receiving numerous requests from students, the Dining Hall is considering
installing a new brick wood-fired pizza oven for baking genuine Neapolitan pizza. (Evidently, the
only other location in Minnesota where one can get genuine Neapolitan-style pizza is Punch
Pizza in St. Paul.) The oven is expensive and the Dining Hall intends to poll the student body to
determine if there is at least 40% support for such a costly purchase. They intend to take a
random sample of 25 students in order to determine student support.
1. Set up the appropriate hypotheses for a statistical test, clearly identifying the population and
parameter.
2. What would be the consequences of making a Type I error? Of making a Type II error?
3. Draw a picture of the sampling distribution of the sample statistic ( p̂ ) under the null
hypothesis (if Ho is true). Find the standard deviation of the sampling distribution. Make sure to
label the x-axis.
4. The hypothesis test will be conducted at the α = 2.5% level. By the 68-95-99.7 rule, about 2.5% of
the sampling distribution lies more than two standard deviations above its mean, so this test will
reject the null hypothesis if the observed value of the statistic ( p̂ ) is greater than 0.60. (Show
this on your picture.) In other words, the decision rule for this test is:
Reject Ho if p̂ > 0.6
Do not reject Ho if p̂ ≤ 0.6.
Since at this stage we are assuming that p = 0.40 and the null hypothesis is true, the rejection
region to the right of 0.60 represents the region where we incorrectly reject the null hypothesis.
That is, we commit a Type I error. Thus the area to the right of 0.60 is Prob(Type I error). Notice
that we could decrease the probability of Type I error by moving the vertical line to the right, that
is, by making a more stringent decision rule for when to reject Ho.
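
The cutoff of 0.60 can be checked numerically. The following is a minimal sketch, assuming the normal approximation to the sampling distribution of p̂ and using only Python's standard library; the variable names are illustrative and not part of the handout.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.40   # proportion assumed under Ho
n = 25      # sample size from the case study

# Standard deviation of p-hat when Ho is true
sd_null = sqrt(p0 * (1 - p0) / n)

# 68-95-99.7 rule: about 2.5% of the area lies more than 2 SDs above the mean
cutoff = p0 + 2 * sd_null

# Area to the right of 0.60 under the null distribution (normal approximation)
type1_prob = 1 - NormalDist(mu=p0, sigma=sd_null).cdf(0.60)

print(f"SD under Ho:     {sd_null:.3f}")
print(f"2-SD cutoff:     {cutoff:.3f}")
print(f"P(Type I error): {type1_prob:.3f}")
```

The two-standard-deviation cutoff comes out just under 0.60, which is why the handout rounds the decision rule to 0.60. Because of that rounding, the approximate Type I error probability is a little below the nominal 2.5%.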
5. Suppose that the Dining Hall staff (before they conduct their poll) suspects that in fact 50% of
the student body would support the change. That is, they suspect that p = 0.50. We will look at
the consequences of this. First of all, if that were in fact the case, then we should expect that when
we conduct the hypothesis test, we will reject the null hypothesis (since it’s in fact false).
Is that, however, what will actually happen?
6. Here you will simulate the random survey and the hypothesis test under the assumption that
there is 50% support for the change. Start your simulation at the line number given at the top of
this handout. Take a sample of size 25, where each student has a 50% chance of SUPPORT and a
50% chance of NON-SUPPORT. Compute p̂ from your sample. Now use the decision rule on
the other side of the page to determine whether you will reject or not reject the null
hypothesis.
For my survey, p̂ = _________ therefore I will [ reject, not reject ] the null hypothesis.
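
The handout's simulation uses a line of random digits, but the same experiment can be sketched in software. The example below is one possible version, assuming Python's built-in pseudo-random generator in place of the random digit table; the function name `simulate_survey` and the seed are hypothetical choices made for illustration.

```python
import random

def simulate_survey(rng, n=25, p_true=0.50, cutoff=0.60):
    """Simulate one poll of n students; return (p_hat, reject_null)."""
    # Each student independently supports the oven with probability p_true
    supporters = sum(1 for _ in range(n) if rng.random() < p_true)
    p_hat = supporters / n
    # Decision rule from the handout: reject Ho only if p-hat > 0.60
    return p_hat, p_hat > cutoff

rng = random.Random(215)  # arbitrary seed so the run is repeatable

# One survey, as in the exercise
p_hat, reject = simulate_survey(rng)
print(f"p-hat = {p_hat:.2f}, reject Ho: {reject}")

# Many surveys, to see how often the test rejects when p is really 0.50
trials = 10_000
rejections = sum(simulate_survey(rng)[1] for _ in range(trials))
reject_rate = rejections / trials
print(f"Fraction of simulated surveys that reject Ho: {reject_rate:.3f}")
```

Repeating the survey many times shows how rarely the test rejects Ho even though Ho is false in this scenario.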
7. Let’s take a closer look at what’s going on. If the null hypothesis is false and p = 0.50, draw the
sampling distribution of p̂ . Draw a vertical line through 0.60. By our decision rule, if the
observed p̂ value is to the left of the line, we do not reject Ho, and if it is to the right of the line
we do reject Ho. Mark your observed p̂ value on the horizontal axis.
The observed values to the left of the line represent outcomes where we don’t reject the null
hypothesis. But the null hypothesis is false. Thus, these outcomes are Type II errors, and the
probability of Type II error, P(Type II error), is represented by the shaded region to the left of the
line. We will discuss the inverse relationship between P(Type I error) and P(Type II error):
for a fixed sample size, if we decrease one, we increase the other.
8. The power of the test is defined as 1 – β, where β = P(Type II error). That is, power = P(Rejecting Ho when Ho is false).
We would certainly like to reject Ho when it is false, so we want the power to be as large as
possible. On your graph, the power is the area to the right of the vertical line. Many statistical
studies, such as those done for the government, insist on a minimum power of 80%.
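
The areas on the graph can be estimated numerically. The sketch below assumes the true proportion is p = 0.50 and computes P(Type II error) and the power in two ways: with the normal approximation used in the picture, and with the exact binomial distribution of the number of supporters. The variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n = 25
p_true = 0.50   # the Dining Hall's suspected level of support
cutoff = 0.60   # reject Ho if p-hat > 0.60

# Normal approximation: sampling distribution of p-hat when p = 0.50
sd_alt = sqrt(p_true * (1 - p_true) / n)
type2_normal = NormalDist(mu=p_true, sigma=sd_alt).cdf(cutoff)  # area left of the line
power_normal = 1 - type2_normal                                 # area right of the line

# Exact binomial: p-hat > 0.60 means at least 16 of the 25 students support
power_exact = sum(
    comb(n, k) * p_true**k * (1 - p_true)**(n - k) for k in range(16, n + 1)
)
type2_exact = 1 - power_exact

print(f"Normal approx: P(Type II) = {type2_normal:.3f}, power = {power_normal:.3f}")
print(f"Exact binomial: P(Type II) = {type2_exact:.3f}, power = {power_exact:.3f}")
```

Both calculations put the power well below the 80% benchmark. The two answers differ somewhat because the normal curve is only an approximation for a sample of 25.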
9. The problem with the Dining Hall study is that there is not enough power to detect a true
proportion of p = 0.50. As we saw in the simulation, most of our tests failed to reject. How can we
increase power? The most direct way is to increase the sample size. In many statistical studies, a
big question is determining the sample size that will give sufficient power. This is a
more advanced topic that we won’t cover in this class. But let’s look at what happens when we
increase the sample size to 100. . . .
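
The effect of sample size on power can be explored with a short calculation. The sketch below is not the handout's own worked answer. It assumes the normal approximation, a true proportion of p = 0.50, and a rejection cutoff recomputed for each sample size as two standard deviations above 0.40 (the same 68-95-99.7 reasoning used for n = 25). The function name `approx_power` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(n, p0=0.40, p_true=0.50):
    """Approximate power of the one-sided test at roughly the 2.5% level."""
    # Cutoff: two null standard deviations above the null proportion (assumed rule)
    sd_null = sqrt(p0 * (1 - p0) / n)
    cutoff = p0 + 2 * sd_null
    # Sampling distribution of p-hat when the true proportion is p_true
    sd_alt = sqrt(p_true * (1 - p_true) / n)
    # Power is the area to the right of the cutoff under that distribution
    return 1 - NormalDist(mu=p_true, sigma=sd_alt).cdf(cutoff)

for n in (25, 100, 400):
    print(f"n = {n:3d}: approximate power = {approx_power(n):.3f}")
```

Under these assumptions the power rises as n grows, because both sampling distributions become narrower and overlap less.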