TESTING WITH VARIANCES

As mentioned in class, the variance presents a new problem. We’d like to be able to test the standard deviation directly; however, that is not possible, so we test its square. This fact alone causes issues with our previous Z and t distributions. Since the variance can never be negative, we are required to use a new distribution. For testing single variances, we will use the distribution that arises from squaring normal values, which is called the chi-square (χ²) distribution.

Let’s start with a confidence interval. With a confidence interval around a mean, we had a “±” term. The variance is slightly different:

$$\frac{(n-1)s^2}{\chi^2_{Upper}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{Lower}}$$

The “n – 1” and “s²” are exactly what you think they are: the sample size minus 1 and the sample variance, respectively. The new terms you’re looking at are the denominators. We’ll need to get these from the table.

To show you how this works, let’s do a simple confidence interval. Suppose you have a sample variance of 4 based on a sample of 8 and you want to do a 90% confidence interval. We know right off the bat that n = 8 (and thus n – 1 = 7) and s² = 4. To figure out the denominator values, we call upon the fact that we are looking for a 90% interval. This means that our picture would look like this:

[Figure: χ² distribution with both tails shaded red]

Look familiar? It’s the same general idea as the confidence interval around means; the only difference is the shape of the distribution. This brings up another issue: the χ² distribution is not symmetric. We cannot simply find one number and use its negative counterpart.

So let’s focus on finding the values for the denominators. First of all, we know immediately that the sum of the red areas is 0.10 (or at least you should by now). This tells us that each red area on its own has an area of 0.05. This is where we’ll use the table. The table in your text gives us the area to the right of certain values. So which rows should we be using for our particular problem? 0.05 and 0.95.
Here’s why: We know the table gives us area to the right. We know that the red area to the right of the line that “chops” the right tail is 0.05. So this is exactly what the table gives us. But why 0.95? Take a gander at this picture.

[Figure: the same χ² distribution, with the left tail grey and everything to the right of the lower cutoff shaded red]

We are concerned with the number that corresponds to the boundary between the grey and red areas in the picture above. We know the far right red area (the right tail) is 0.05 (see above). We also know that the newly shaded red area is 0.90. The combination of the two is 0.95; this gives us the value we need.

Oh, one more thing. You’re going to need degrees of freedom again. Have any guesses? Look at the equation…there it is. Your degrees of freedom are (n – 1), located in the equation (as usual).

So now we can get the values we need. They are 2.1673 and 14.0671 (df = 7). Now simply plug those into the equation you saw earlier and you get:

$$\frac{(8-1)4}{14.0671} \le \sigma^2 \le \frac{(8-1)4}{2.1673}$$

So we are 90% confident that the population variance is between 1.99 and 12.92. See later pages for another example.

Now let’s try running a hypothesis test on our single variance. We’ll use the same data (s² = 4, n = 8). Try this one: Conduct a hypothesis test (α = 0.02) that the population variance is equal to 7.

As usual, we have to complete the full six steps. Fortunately, they are not all that different from those you learned for the single mean test. Here we go.

Step 1) You should at least know that there will be an H0 and an HA. But notice our claim is that the variance is 7. Just like before, that’s what goes in the null, and the complement of that is the alternative.
H0: σ² = 7, HA: σ² ≠ 7

Step 2) Now we need our critical values. For a single variance we are using the χ² table, just like with the confidence intervals. The fact that your alpha level is 0.02 means that each tail contains 0.01 of area. That indicates that the two “rows” you’re interested in are 0.01 and 0.99 respectively.
With your n – 1 degrees of freedom (n – 1 = 7), you find critical values of 1.2390 and 18.4753.
χ²crit = 1.2390 and 18.4753

Step 3) All you do (again) is state in words what the graph would show. If χ²stat > 18.4753 or < 1.2390, reject H0. Else, fail to reject H0.

Step 4) Time to calculate your test statistic. The equation for this is:

$$\chi^2_{stat} = \frac{(n-1)s^2}{\sigma^2}$$

Just like with the mean tests, you’re getting the number to replace your Greek letter from the hypotheses.

$$\chi^2_{stat} = \frac{(8-1)4}{7} = 4$$

Steps 5 and 6) From here on in, you know what to do. Since 4 falls between 1.2390 and 18.4753, fail to reject H0. We have no reason to believe that the variance is different from 7. See later pages for another example.

Now it’s time for the fun one. We can actually test that two variances are equal! You’ll see in a bit that the test statistic (step 4) involves dividing two variances, so we are effectively dividing one χ² quantity by another. When you do that, your result follows a new distribution called the F distribution (and thank goodness, I was frankly tired of continuing to have to locate the Greek letter chi).

There are a few things to make you aware of about the F-tables in your book. First, F-tables require two degrees of freedom: one for the numerator and one for the denominator. This should come somewhat intuitively, as I’ve already mentioned that we will be using division in step 4. Second, your tables only give you the values on the right tail. If we are conducting tests that are looking for equality of variances, they have to be two-sided (just like all the others we’ve encountered). How we deal with this I will discuss in step 2. Last and certainly not least, you have a separate F-table for each alpha level. This means that when you are using the F-table to find your critical value, the first step is to choose the proper table. If you are doing a one-sided test, simply find the table whose alpha level matches that of the problem.
If you are doing a two-sided test, find the table that matches half of your given alpha level. So for a two-sided test with α = 0.05, the table you choose is 0.025.

So, let’s take off and test a couple of variances. Conduct a hypothesis test (α = 0.02) that the following two variances are equal: s₁² = 4, n₁ = 8, s₂² = 6, n₂ = 6.

Step 1) Think, what are we testing? Got it? So write it!
H0: σ₁² = σ₂², HA: σ₁² ≠ σ₂²

Step 2) I mentioned earlier that we need a numerator and a denominator degrees of freedom. In this test, both df are n – 1. Ah…but which is the numerator and which the denominator? Here’s where we’re going to fix one of those problems with the F-table. Always put the larger variance in the numerator. What this means is that our s₂² will be the numerator. The sample size follows the variance, so numerator df = 6 – 1 = 5 and denominator df = 8 – 1 = 7. Since our alpha level of 0.02 tells us to look for the table labeled 0.01, we’re ready.
Fcrit = 7.460

Step 3) Yes, I’m making you do this yet again. If Fstat > 7.460, reject H0. Else, fail to reject H0.

Step 4) Now is where you need to remember that the larger variance goes on top. The test statistic is:

$$F_{stat} = \frac{s^2_{larger}}{s^2_{smaller}}, \quad \text{where } s^2_{larger} > s^2_{smaller}$$

If you do it backwards, your statistic will be less than 1 and you will always fail to reject. This is okay if the variances are in fact equal. Not good if they aren’t.

$$F_{stat} = \frac{6}{4} = 1.5$$

Steps 5 and 6) Since 1.5 is less than 7.460, fail to reject H0. It would appear that the two variances are equal.

Clear? Hope so. Now try out what you’ve learned…

Sample 1: s² = 81, n = 12; Sample 2: s² = 100, n = 10
1. Construct a 95% confidence interval for variance 1.
2. Conduct a hypothesis test (α = 0.10) on variance 1 to determine if it equals 100.
3. Conduct a hypothesis test (α = 0.02) that the two variances are equal.

Try these on your own. Answers are on the succeeding page(s).

1. Construct a 95% confidence interval for variance 1.

$$\frac{(12-1)81}{21.92} \le \sigma^2 \le \frac{(12-1)81}{3.8157}$$

40.65 ≤ σ² ≤ 233.51

2. Conduct a hypothesis test (α = 0.10) on variance 1 to determine if it equals 100.
H0: σ² = 100, HA: σ² ≠ 100
χ²crit = 4.5748 and 19.6752
If χ²stat > 19.6752 or < 4.5748, reject H0. Else, fail to reject H0.

$$\chi^2_{stat} = \frac{(n-1)s^2}{\sigma^2} = \frac{(12-1)81}{100} = 8.91$$

Since 8.91 falls between 4.5748 and 19.6752, fail to reject H0. We have no reason to believe that the variance is different from 100.

3. Conduct a hypothesis test (α = 0.02) that the two variances are equal.

H0: σ₁² = σ₂², HA: σ₁² ≠ σ₂²
Since the variance for sample 2 is larger, we’ll use sample 2’s “n – 1” (10 – 1 = 9) for our numerator df and sample 1’s (12 – 1 = 11) for our denominator df.
Fcrit = 4.632
If Fstat > 4.632, reject H0. Else, fail to reject H0.

$$F_{stat} = \frac{100}{81} = 1.23$$

Since 1.23 is less than 4.632, fail to reject H0. We cannot reject the null that the two populations have the same variance.
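
If you’d like to double-check your arithmetic by computer, the three calculations in this handout can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the function names are made up for illustration (they are not from your text), and the code does not look up critical values for you. You still read those from the χ² and F tables and pass them in; the code just does the plugging in and the comparison from steps 3 and 4.

```python
# Minimal sketch: the function names below are hypothetical, chosen only
# for illustration. Critical values are NOT computed here; pass in the
# numbers you read from the chi-square and F tables.


def variance_confidence_interval(s2, n, chi2_upper, chi2_lower):
    """Return (low, high) bounds for the population variance.

    chi2_upper is the larger table value (right-tail cutoff) and
    chi2_lower is the smaller one (left-tail cutoff).
    """
    numerator = (n - 1) * s2
    return numerator / chi2_upper, numerator / chi2_lower


def single_variance_test(s2, n, sigma2_null, chi2_crit_low, chi2_crit_high):
    """Two-sided test of H0: sigma^2 = sigma2_null. Returns (stat, reject)."""
    stat = (n - 1) * s2 / sigma2_null
    reject = stat < chi2_crit_low or stat > chi2_crit_high
    return stat, reject


def two_variance_test(s2_a, s2_b, f_crit):
    """Two-sided test of H0: equal variances. Returns (stat, reject).

    The larger sample variance always goes in the numerator, so only
    the right-tail critical value is needed.
    """
    stat = max(s2_a, s2_b) / min(s2_a, s2_b)
    return stat, stat > f_crit


# Worked examples from the handout (s^2 = 4, n = 8, df = 7).
ci_low, ci_high = variance_confidence_interval(4, 8, 14.0671, 2.1673)
print(f"90% CI: {ci_low:.2f} to {ci_high:.2f}")

chi2_stat, chi2_reject = single_variance_test(4, 8, 7, 1.2390, 18.4753)
print(chi2_stat, chi2_reject)  # 4.0 False

# Two-variance example: s1^2 = 4, s2^2 = 6, Fcrit = 7.460.
f_stat, f_reject = two_variance_test(4, 6, 7.460)
print(f_stat, f_reject)  # 1.5 False
```

Notice that `two_variance_test` picks the larger variance for the numerator on its own, so you can’t accidentally do the division backwards. You are still responsible for matching the numerator and denominator df to the right samples when you look up Fcrit.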