Variance Testing

TESTING WITH VARIANCES
As mentioned in class, the variance presents a new problem. We’d like to be able to test the standard
deviation directly, but that is not possible, so we test its square, the variance. This fact alone causes
issues with our previous Z and t distributions. Since the variance can never be negative, we are required
to use a new distribution. For testing single variances, we will use the chi-square (χ²) distribution, which
comes from squaring (and summing) standard normal values.
Let’s start with a confidence interval. With a confidence interval around a mean, we had a “±” term.
The variance is slightly different.
$$\frac{(n-1)s^2}{\chi^2_{Upper}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{Lower}}$$
The “n – 1” and “s²” are exactly what you think they are: the sample size minus 1 and the sample
variance, respectively. The new terms you’re looking at are the denominators. We’ll need to get these
from the table.
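If you are wondering why the upper table value ends up under the lower bound, here is a sketch of the reasoning. It assumes the sample comes from a normal population, in which case (n – 1)s²/σ² follows a χ² distribution with n – 1 degrees of freedom. The symbol α (1 minus the confidence level) is introduced here for convenience.

```latex
P\left(\chi^2_{Lower} \le \frac{(n-1)s^2}{\sigma^2} \le \chi^2_{Upper}\right) = 1 - \alpha
\quad\Longrightarrow\quad
\frac{(n-1)s^2}{\chi^2_{Upper}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{Lower}}
```

Solving for σ² means taking reciprocals, which flips the inequalities. That is why the larger table value produces the smaller bound.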
To show you how this works, let’s do a simple confidence interval. Suppose you have a sample variance
of 4 based on a sample of 8 and you want to do a 90% confidence interval. We know right off the bat
that n = 8 (and thus n – 1 = 7) and s² = 4. To figure out the denominator values, we call upon the fact
that we are looking for a 90% interval. This means that our picture would look like this:
[Figure: χ² distribution curve with a red area in each tail]
Look familiar? It’s the same general idea as the confidence interval around means; the only difference is
that the shape of the distribution is different. This brings up another issue: the χ² distribution is not
symmetric. We cannot simply find one number and use its negative counterpart.
So let’s focus on finding the values for the denominators. First of all, we know immediately that the sum
of the red areas is 0.10 (or at least you should by now). This tells us that each red area on its own has
an area of 0.05. This is where we’ll use the table. The table in your text gives us the area to the right of
certain values. So what rows should we be using for our particular problem? 0.05 and 0.95. Here’s why:
We know the table gives us area to the right. We know that the red area to the right of the line that
“chops” the right tail is 0.05. So this is exactly what the table gives us. But why 0.95? Take a gander at
this picture.
[Figure: the same curve with the middle area now shaded red and the left tail in grey]
We are concerned with the number that corresponds to the boundary between the grey and red areas
in the picture you see above. We know the far right red area (the right tail) is 0.05 (see above). We also
know that the newly shaded red area is 0.90. The combination of the two is 0.95, which is the area to the
right of that boundary, so 0.95 is the row that gives us the value we need.
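The lookup rule above can be sketched in a few lines of Python. This is only an illustration: the function name right_tail_areas is made up here, and it assumes your table lists areas to the right, as the one in your text does.

```python
def right_tail_areas(confidence):
    """Return the two 'area to the right' values to look up in a chi-square table.

    The first value locates the upper (right-tail) cutoff,
    the second locates the lower (left-tail) cutoff.
    """
    alpha = 1.0 - confidence          # total area in the two tails
    upper_area = alpha / 2            # area to the right of the upper cutoff
    lower_area = 1.0 - alpha / 2      # area to the right of the lower cutoff
    # Round away floating-point noise so the values match table headings.
    return round(upper_area, 10), round(lower_area, 10)


print(right_tail_areas(0.90))  # (0.05, 0.95)
```

For a 95% interval the same function gives 0.025 and 0.975.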
Oh, one more thing. You’re going to need degrees of freedom again. Have any guesses? Look at the
equation…there it is. Your degrees of freedom are (n – 1), located in the equation (as usual). So now we
can get the values we need. With df = 7, they are 2.1673 (area 0.95) and 14.0671 (area 0.05). Now simply
plug those into the equation you saw earlier and you get:
$$\frac{(8-1)4}{14.0671} \le \sigma^2 \le \frac{(8-1)4}{2.1673}$$
So we are 90% confident that the population variance is between 1.99 and 12.92.
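If you want to check the arithmetic, here is a minimal Python sketch. The function name variance_confidence_interval is hypothetical, and the two χ² values are typed in by hand from the table (df = 7) rather than computed.

```python
def variance_confidence_interval(s2, n, chi2_upper, chi2_lower):
    """Confidence interval for a population variance.

    s2         : sample variance
    n          : sample size
    chi2_upper : table value with the small right-tail area (e.g. 0.05)
    chi2_lower : table value with the large right-tail area (e.g. 0.95)
    """
    df = n - 1
    lower_bound = df * s2 / chi2_upper   # bigger denominator, smaller bound
    upper_bound = df * s2 / chi2_lower   # smaller denominator, bigger bound
    return lower_bound, upper_bound


# Example from the text: s^2 = 4, n = 8, 90% interval, df = 7.
low, high = variance_confidence_interval(4, 8, 14.0671, 2.1673)
print(f"{low:.2f} <= sigma^2 <= {high:.2f}")  # 1.99 <= sigma^2 <= 12.92
```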
See later pages for another example.
Now let’s try running a hypothesis test on our single variance. We’ll use the same data (s² = 4, n = 8).
Try this one:
Conduct a hypothesis test (α = 0.02) that the population variance is equal to 7.
As usual, we have to complete the full six steps. Fortunately, they are not all that different from
those you learned for the single mean test. Here we go.
Step 1) You should at least know that there will be an H0 and an HA. But, notice our claim is that the
variance is 7. Just like before, that’s what goes in the null and the complement of that is the alternative.
H0: σ² = 7, HA: σ² ≠ 7
Step 2) Now we need our critical values. For a single variance we are using the χ² table just like with the
confidence intervals. The fact that your alpha level is 0.02 means that each tail contains 0.01 of area.
That indicates that the two “rows” you’re interested in are 0.01 and 0.99 respectively. With your n – 1
degrees of freedom (n – 1 = 7), you find critical values of 1.2390 and 18.4753.
χ²crit = 1.2390 and 18.4753
Step 3) All you do (again) is state in words what the graph would show.
If χ²stat > 18.4753 or χ²stat < 1.2390, reject H0. Else, fail to reject H0.
Step 4) Time to calculate your test statistic. The equation for this is:
$$\chi^2_{stat} = \frac{(n-1)s^2}{\sigma^2}$$
Just like with the mean tests, you’re getting the number to replace your Greek letter from the hypotheses.
$$\chi^2_{stat} = \frac{(8-1)4}{7} = 4$$
Steps 5 and 6) From here on in, you know what to do.
Fail to reject H0, since the test statistic falls between the two critical values. We have no reason to
believe that the variance is different from 7.
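Steps 3 through 5 can be sketched in Python as follows. The function name chi_square_variance_test is made up for this illustration, and the critical values are the ones read from the table in step 2, not computed by the code.

```python
def chi_square_variance_test(s2, n, sigma2_null, crit_low, crit_high):
    """Two-sided test of H0: sigma^2 = sigma2_null.

    Returns the test statistic and True if H0 is rejected.
    crit_low and crit_high are the two chi-square table values.
    """
    stat = (n - 1) * s2 / sigma2_null
    # Reject only if the statistic lands in either tail.
    reject = stat < crit_low or stat > crit_high
    return stat, reject


# Example from the text: s^2 = 4, n = 8, H0: sigma^2 = 7, alpha = 0.02.
stat, reject = chi_square_variance_test(4, 8, 7, 1.2390, 18.4753)
print(stat, "reject H0" if reject else "fail to reject H0")  # 4.0 fail to reject H0
```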
See later pages for another example.
Now it’s time for the fun one. We can actually test that two variances are equal! You’ll see in a bit that
the test statistic (step 4) involves dividing two variances, so we are effectively dividing two χ²
distributions. When you do that, your result is a new distribution called the F distribution (and thank
goodness, I was frankly tired of continuing to have to locate the Greek letter chi).
There are a few things to make you aware of about the F-tables in your book. First, F-tables require two
degrees of freedom: one for the numerator and the other for the denominator. This should be fairly
intuitive, as I’ve already mentioned that we will be using division in step 4. Second, your
tables only give you the values on the right tail. If we are conducting tests that are looking for equality
of variances, they have to be two-sided (just like all the others we’ve encountered). I will discuss how we
deal with this in step 2.
Last and certainly not least, you have a separate F-table for each alpha level. This means that when you
are using the F-table to find your critical value, the first step is to choose the proper table. If you are
doing a one-sided test, simply find the alpha level that matches that of the problem. If you are doing a
two-sided test, find the table that matches half of your given alpha level. So for a two-sided test with
α=0.05, the table you choose is 0.025. So, let’s take off and test a couple of variances.
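The table-picking rule is simple enough to write as a one-line function. This is just a sketch, and the name f_table_alpha is invented for it.

```python
def f_table_alpha(alpha, two_sided):
    """Return the alpha level of the F-table to open.

    A two-sided test splits alpha across both tails, but the table
    only covers the right tail, so we look up half of alpha.
    """
    return alpha / 2 if two_sided else alpha


print(f_table_alpha(0.05, two_sided=True))   # 0.025
print(f_table_alpha(0.05, two_sided=False))  # 0.05
```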
Conduct a hypothesis test (α = 0.02) that the following two variances are equal: s₁² = 4, n₁ = 8, s₂² = 6, n₂ = 6.
Step 1) Think, what are we testing? Got it? So write it!
H0: σ₁² = σ₂², HA: σ₁² ≠ σ₂²
Step 2) I mentioned earlier that we need a numerator and denominator degrees of freedom. In this
test, both df are n – 1. Ah…but which is the numerator and which denominator? Here’s where we’re
going to fix one of those problems with the F-table. Always put the larger variance in the numerator.
What this means is that our s₂² will be the numerator. The sample size follows the variance, so
numerator df = 6 – 1 = 5 and denominator df = 8 – 1 = 7. Since our alpha level of 0.02 tells us we look for
the table labeled 0.01, we’re ready.
Fcrit = 7.460
Step 3) Yes, I’m making you do this yet again.
If Fstat > 7.460, reject H0. Else, fail to reject H0.
Step 4) Now is where you need to remember that the larger variance goes on top. The test statistic is:
$$F_{stat} = \frac{s_i^2}{s_j^2}, \quad \text{where } s_i^2 > s_j^2$$
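If you do it backwards, you will always fail to reject, because the ratio will be less than 1 while the
critical values in your tables are greater than 1. This is okay if the variances are in fact equal. Not
good if they aren’t.
$$F_{stat} = \frac{6}{4} = 1.5$$
Since 1.5 is less than 7.460, fail to reject H0. We have no reason to believe that the two variances are
different.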
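The “larger variance on top” rule and the way the degrees of freedom follow the variances can be sketched like this. The function name f_statistic is hypothetical, and the critical value 7.460 is the one read from the 0.01 table above, not something the code computes.

```python
def f_statistic(s2_a, n_a, s2_b, n_b):
    """Return (F_stat, numerator_df, denominator_df).

    The larger sample variance always goes in the numerator,
    and each df is the matching sample size minus 1.
    """
    if s2_a >= s2_b:
        return s2_a / s2_b, n_a - 1, n_b - 1
    return s2_b / s2_a, n_b - 1, n_a - 1


# Example from the text: sample 1 has s^2 = 4, n = 8; sample 2 has s^2 = 6, n = 6.
f_stat, df_num, df_den = f_statistic(4, 8, 6, 6)
f_crit = 7.460  # from the 0.01 F-table with df (5, 7)
print(f_stat, df_num, df_den)                                   # 1.5 5 7
print("reject H0" if f_stat > f_crit else "fail to reject H0")  # fail to reject H0
```

Because the function picks the order for you, swapping the two samples in the call gives the same answer.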
Clear? Hope so. Now try on what you’ve learned…
Sample 1: s² = 81, n = 12; Sample 2: s² = 100, n = 10
1. Construct a 95% confidence interval for variance 1.
2. Conduct a hypothesis test (α=0.10) on variance 1 to determine if it equals 100.
3. Conduct a hypothesis test (α=0.02) that the two variances are equal.
Try these on your own. Answers are on the succeeding page(s).
1. Construct a 95% confidence interval for variance 1.
$$\frac{(12-1)81}{21.92} \le \sigma^2 \le \frac{(12-1)81}{3.8157}$$
$$40.65 \le \sigma^2 \le 233.5$$
2. Conduct a hypothesis test (α=0.10) on variance 1 to determine if it equals 100.
H0: σ² = 100
HA: σ² ≠ 100
χ²crit = 4.5748 and 19.6752
If χ²stat > 19.6752 or χ²stat < 4.5748, reject H0. Else, fail to reject H0.
$$\chi^2_{stat} = \frac{(n-1)s^2}{\sigma^2} = \frac{(12-1)81}{100} = 8.91$$
Fail to reject H0. We have no reason to believe that the variance is different from 100.
3. Conduct a hypothesis test (α=0.02) that the two variances are equal.
H0: σ₁² = σ₂²
HA: σ₁² ≠ σ₂²
Since the variance for 2 is larger, we’ll use 2’s “n – 1” (10 – 1 = 9) for our numerator df and 1’s
(12 – 1 = 11) for our denominator df.
Fcrit = 4.632
If Fstat > 4.632, reject H0. Else, fail to reject H0.
$$F_{stat} = \frac{100}{81} = 1.23$$
Fail to reject H0. We cannot reject the null that the two populations have the same variance.
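To double-check the three practice answers, here is a short Python sketch that redoes the arithmetic. All table values are typed in from the answers above; the variable names are made up for this check.

```python
# Practice data: sample 1 has s^2 = 81, n = 12; sample 2 has s^2 = 100, n = 10.
s2_1, n_1 = 81, 12
s2_2, n_2 = 100, 10

# 1. 95% confidence interval for variance 1 (df = 11, table values 21.92 and 3.8157).
ci_low = (n_1 - 1) * s2_1 / 21.92
ci_high = (n_1 - 1) * s2_1 / 3.8157

# 2. Chi-square test of H0: sigma^2 = 100 (critical values 4.5748 and 19.6752).
chi2_stat = (n_1 - 1) * s2_1 / 100
reject_chi2 = chi2_stat < 4.5748 or chi2_stat > 19.6752

# 3. F test of equal variances: larger variance (sample 2) on top, F crit = 4.632.
f_stat = max(s2_1, s2_2) / min(s2_1, s2_2)
reject_f = f_stat > 4.632

print(f"CI: {ci_low:.2f} to {ci_high:.2f}")
print(f"chi-square stat: {chi2_stat:.2f}, reject: {reject_chi2}")
print(f"F stat: {f_stat:.2f}, reject: {reject_f}")
```

Note that 100/81 works out to about 1.23, which is well below the critical value of 4.632.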