Single Mean Hypothesis Testing

SINGLE MEAN HYPOTHESIS TESTING [1]
The defining characteristic of this type of hypothesis test is that you are testing whether a population
mean (μ) is equal to some constant “a” (like 3 or 6 or any number; you could even test that it’s equal to
e or π!), using the mean of a sample as your evidence. You also have to distinguish between the times
that you know the population standard deviation (σ) and those that you don’t, which would require you
to use the sample standard deviation (s). If you know σ, you can use the Z-distribution. Otherwise,
you’re required to use the t-distribution, which takes the place of the Z.
Okay, so let’s get down to it. Since my wife and I have two dachshunds that we will breed from time to
time, I’ll use that as my backdrop. Suppose I want to conduct a hypothesis test (α=0.05) that the
average litter of a dachshund is 5 puppies based on a sample of 23 that produced a sample mean of 3.4
and I know that the population standard deviation is 1.2 puppies.
What I’m going to do now is reprint that last paragraph and highlight the important points to note.
“Suppose I want to conduct a hypothesis test (α=0.05) that the average litter of a dachshund is 5
puppies based on a sample of 23 that produced a sample mean of 3.4 and I know that the population
standard deviation is 1.2 puppies.”
“Population standard deviation”: The fact that we know the population value means we will use the Z-distribution. Had we only known the sample version, this would indicate we are using the t-table.
“α=0.05”: This is going to tell you what critical value to use. Since we are using the Z-table, this indicates
the following [2].
The red area is the “rejection region” which equals the α that we are using. Since this is a two-sided
test, each side must equal 0.025. This leaves 0.95 in the blue and green areas combined. Our Z-table
will give us the value of the green area. This is half of the 0.95 which is 0.475. This 0.475 is the area on
the table that we should be looking for in the “guts”. When we locate 0.475, we can see the
corresponding Z-value is 1.96. That is what we’ll use for the critical value in a moment.
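As a cross-check on the table lookup, the same critical value can be computed directly. This is only a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are my own and are not part of the handout.

```python
from statistics import NormalDist

alpha = 0.05                    # significance level from the example
tail_area = alpha / 2           # two-sided test: 0.025 in each tail
table_area = 0.5 - tail_area    # area between 0 and Z in the table "guts"

# The normal CDF measures area from the far left, so add back the left half (0.5).
z_crit = NormalDist().inv_cdf(0.5 + table_area)

print(round(table_area, 3))  # 0.475
print(round(z_crit, 2))      # 1.96
```

The table works with the area between 0 and Z, while the CDF works with everything to the left of Z, which is why 0.5 is added before inverting.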
[1] This is written for someone who has at least a “skeleton” knowledge of testing.
[2] See Appendix A for more practice on finding a Z-based critical value.
The remaining numbers are used to calculate our Z-statistic. I’ll get to those in a bit when I complete the
test.
Since we are doing a hypothesis test, we are required to do all six steps in the procedure. These are:
1. Specify the null and alternative hypothesis
2. Determine the critical value
3. State the decision rule
4. Calculate the test statistic
5. Make decision
6. State conclusion
Here we go.
1. Specify the null and alternative hypothesis
Since we are testing whether or not the average litter is 5 puppies, our null hypothesis is exactly that.
The alternative is the complement, so it’s the “opposite” of the null.
H0: μ = 5
HA: μ ≠ 5
2. Determine the critical value
We did this earlier, so it’s now a matter of stating it formally.
Zcrit = ± 1.96
3. State the decision rule
All we’re doing here is putting in words the picture from the previous page.
If Zstat > 1.96 or < -1.96, reject H0. Else, fail to reject H0.
4. Calculate test statistic
This is the equation we learned in class.
$$Z_{stat} = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
All we need to do here is plug in the values we know. 𝑥̅ is our sample average or 3.4, μ is 5 as we pull it
directly from the null hypothesis, σ is our known population standard deviation or 1.2, and n is simply
our sample size of 23.
$$Z_{stat} = \frac{3.4 - 5}{1.2 / \sqrt{23}} = -6.39$$
5. Make Decision
Now we can utilize our decision rule. Since -6.39 is well less than our critical value of -1.96, we are going
to reject H0.
Reject H0
6. State Conclusion
We try to avoid using statistical terms at this stage; we want to say this so that anyone can understand
our result.
It would appear that the average litter of dachshund puppies is not 5.
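The six steps above can be sketched in a few lines of Python. This is an illustrative sketch, not the method from class: the function name `z_test_two_sided` and its parameter names are my own, and it assumes a two-sided test with a known σ.

```python
from math import sqrt
from statistics import NormalDist


def z_test_two_sided(sample_mean, mu_0, sigma, n, alpha):
    """Two-sided Z test for a single mean when sigma is known."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)        # step 2: critical value
    z_stat = (sample_mean - mu_0) / (sigma / sqrt(n))   # step 4: test statistic
    # Steps 3 and 5: reject when the statistic falls in either tail.
    decision = "reject H0" if abs(z_stat) > z_crit else "fail to reject H0"
    return z_stat, z_crit, decision


# Dachshund example: H0: mu = 5, sample of 23, sample mean 3.4, sigma = 1.2.
z_stat, z_crit, decision = z_test_two_sided(
    sample_mean=3.4, mu_0=5, sigma=1.2, n=23, alpha=0.05
)
print(round(z_stat, 2), round(z_crit, 2), decision)  # -6.39 1.96 reject H0
```

Comparing the absolute value of the statistic to the positive critical value is just a compact way of writing the two-sided decision rule.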
Try doing this one on your own. The answer follows the problem.
Based on a sample of 34 that produces a sample mean of 7.4, conduct a hypothesis test (α=0.01) that
the population average is 7 assuming that σ = 2.1.
H0: μ = 7
HA: μ ≠ 7
Zcrit = ±2.58
If Zstat > 2.58 or < -2.58, then reject H0. Else fail to reject H0.
$$Z_{stat} = \frac{7.4 - 7}{2.1 / \sqrt{34}} = 1.11$$
Fail to reject H0.
We don’t have enough evidence to say that the average is different from 7.
How did you do?
Now let’s try this one:
“Suppose I want to conduct a hypothesis test (α=0.05) that the average litter of a dachshund is 5
puppies based on a sample of 23 that produced a sample mean of 3.4 and a sample standard deviation
of 1.2.”
Look familiar? It’s the same test I walked you through earlier with one twist. See it?
Notice that there is no mention of the population standard deviation. It does indicate that we know the
sample standard deviation; however, that fact alone changes our method. We can only use the Z-distribution
when we know σ. We can use “s” as a substitute, but doing so requires that we now use the
t-distribution in sample sizes less than 500. Other than that, you’ll see that the test is effectively the
same since I simply reproduced the same numbers.
Here we go.
1. Specify the null and alternative hypothesis
H0: μ = 5
HA: μ ≠ 5
2. Determine the critical value
When using the t-table, you have two steps. First, note the three rows at the top. The one you want
right now is “Two Tails” since this is a two-sided test. The values listed in that row are alphas, so our
column here will be “0.05” as that is our alpha level. Next, you’ll need to know the degrees of
freedom [3] so you can locate the proper row. The degrees of freedom for this test are n-1. Our test has
n-1=22, so the table gives us:
tcrit = ±2.0739
[3] See Appendix B for a brief discussion of degrees of freedom.
3. State the decision rule
This needs to reflect the fact that we’re now using a t-distribution, but it essentially is the same idea.
If tstat > 2.0739 or < -2.0739, reject H0. Else, fail to reject H0.
4. Calculate test statistic
This is the equation we learned in class.
$$t_{stat} = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
All we need to do here is plug in the values we know. 𝑥̅ is our sample average or 3.4, μ is 5 as we pull it
directly from the null hypothesis, s is our sample standard deviation or 1.2, and n is simply our sample
size of 23.
$$t_{stat} = \frac{3.4 - 5}{1.2 / \sqrt{23}} = -6.39$$
5. Make Decision
Reject H0
6. State Conclusion
It would appear that the average litter of dachshund puppies is not 5.
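The t-based version of the test can be sketched the same way. In this sketch the critical value is not computed; it is passed in as the number read from the t-table (2.0739 for 22 degrees of freedom). The function name `t_test_two_sided` and its parameters are hypothetical names chosen for illustration.

```python
from math import sqrt


def t_test_two_sided(sample_mean, mu_0, s, n, t_crit):
    """Two-sided t test for a single mean when only s is known.

    t_crit is the positive critical value read from a t-table
    for n - 1 degrees of freedom; it is not computed here.
    """
    df = n - 1                                   # row to use in the t-table
    t_stat = (sample_mean - mu_0) / (s / sqrt(n))
    decision = "reject H0" if abs(t_stat) > t_crit else "fail to reject H0"
    return df, t_stat, decision


# Dachshund example with a sample standard deviation of 1.2.
df, t_stat, decision = t_test_two_sided(
    sample_mean=3.4, mu_0=5, s=1.2, n=23, t_crit=2.0739
)
print(df, round(t_stat, 2), decision)  # 22 -6.39 reject H0
```

The arithmetic is identical to the Z version; only the source of the critical value changes.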
Another one for you to try…
Based on a sample of 22 that produces a sample mean of 1.2 and sample standard deviation of 3.1,
conduct a hypothesis test (α=0.02) that the population average is 2.
H0: μ = 2
HA: μ ≠ 2
tcrit (df =22 – 1 = 21) = ±2.5176
If tstat > 2.5176 or < -2.5176, then reject H0. Else fail to reject H0.
$$t_{stat} = \frac{1.2 - 2}{3.1 / \sqrt{22}} = -1.21$$
Fail to reject H0.
We don’t have enough evidence to say that the average is different from 2.
FINAL NOTE:
Please pay careful attention to the information you are given. The key feature is whether or not you
know the population standard deviation (σ). This will determine which distribution you are using. In
fact, this will be important for some time to come.
APPENDIX A
Finding Zcrit.
Try finding the following critical values for two-sided tests based on the alpha levels.
1. α=0.20
2. α=0.02
3. α=0.15
4. α=0.50
1. α=0.20
The sum of the tails is 0.20, so each tail is 0.10. This means the table value we want to find in
the guts is 0.40. 0.3997 is as close as you can get, so the corresponding Z-value is 1.28.
2. α=0.02
The sum of the tails is 0.02, so each tail is 0.01. This means the table value we want to find in
the guts is 0.4900. 0.4901 is as close as you can get, so the corresponding Z-value is 2.33.
3. α=0.15
The sum of the tails is 0.15, so each tail is 0.075. This means the table value we want to find in
the guts is 0.425. 0.4251 is as close as you can get, so the corresponding Z-value is 1.44.
4. α=0.50
The sum of the tails is 0.50, so each tail is 0.25. This means the table value we want to find in
the guts is 0.25. 0.2486 is as close as you can get, so the corresponding Z-value is 0.67.
I will suggest that you become familiar with the following alpha levels and their corresponding two-tail
Z-values. These are very commonly used in general and extremely commonly used by me!
α=0.10, Zcrit = 1.645
α=0.05, Zcrit = 1.96
α=0.01, Zcrit = 2.58
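If you want to check your table lookups, the two-sided critical values in this appendix can be reproduced numerically. This is a small sketch using the standard-library `statistics.NormalDist`; the helper name `z_crit_two_sided` is my own.

```python
from statistics import NormalDist


def z_crit_two_sided(alpha):
    """Positive critical value that leaves alpha / 2 in each tail."""
    return NormalDist().inv_cdf(1 - alpha / 2)


# Alpha levels from the practice problems and from the list of common values.
for alpha in (0.20, 0.02, 0.15, 0.50, 0.10, 0.05, 0.01):
    print(f"alpha={alpha:.2f}  Zcrit={z_crit_two_sided(alpha):.3f}")
```

The computed values carry more decimals than the table, so expect them to match the table answers only after rounding.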
APPENDIX B
Degrees of Freedom
The basic idea behind degrees of freedom is that it measures the number of parameters that are
allowed to vary. In the beginning this is exactly equal to the sample size. However, when you estimate
pieces of the equation that are not what you are testing, you lose one degree of freedom per estimated
piece.
Examine once again the Zstat equation.
$$Z_{stat} = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
Notice that nothing in here is estimated with the exception of x̄. This doesn’t count because that’s what
we’re testing. As we know both σ and n, we don’t need to use any of our degrees of freedom to
estimate them, so degrees of freedom are irrelevant in this statistic.
Now, look at the tstat.
$$t_{stat} = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
We have to estimate “s” in this equation. This costs us a degree of freedom. Since it is all we are
estimating, we only lose that one so our degrees of freedom is n – 1.
There’s a nice way to see this. Degrees of freedom are almost always located in the equation. It’s not
totally obvious here, but if you recall the equation for s, you’ll see
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$
See the denominator? There it is.
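The n – 1 in the denominator of s can also be checked numerically. The sketch below computes s by hand and compares it with Python's standard-library `statistics.stdev`, which also divides by n – 1. The litter sizes are made-up numbers used purely for illustration.

```python
from math import sqrt
from statistics import mean, stdev

litter_sizes = [3, 5, 4, 2, 6]   # hypothetical data, for illustration only

n = len(litter_sizes)
x_bar = mean(litter_sizes)
sum_sq = sum((x - x_bar) ** 2 for x in litter_sizes)

s_by_hand = sqrt(sum_sq / (n - 1))   # divide by n - 1, the degrees of freedom
s_library = stdev(litter_sizes)      # statistics.stdev also divides by n - 1

print(n - 1, s_by_hand, s_library)
```

Both values agree because the library function uses the same n – 1 divisor that gives the t-statistic its degrees of freedom.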