UNC-Wilmington
Department of Economics and Finance
ECN 377
Dr. Chris Dumas
Hypothesis Tests for Variances
Sometimes we want to test hypotheses about the spread, or variance, of a variable’s values, instead of testing
hypotheses about the mean value of the variable. In this handout we will consider two tests involving variances.
In the first test, we use a variance from a sample to test a hypothesis about the variance of the underlying
population. In the second test, we use the variances from two samples, each of which was taken from a different
population, to test hypotheses about possible differences in the variances of the two populations. These tests
apply to numerical, measurement variables.
Using a Sample Variance to Test a Hypothesis about a Population Variance
Suppose there is a population of individuals with characteristic X, a numerical measurement variable. Suppose
further that data on X are not available for the full population, but instead we have only data from a random
sample of individuals.
μ = population mean (unknown)
σ2 = population variance (unknown)
n = sample size
Xbar = sample mean
s2 = sample variance
Suppose there is a claim that the population variance σ2 is equal to a given value “a.” We can use the sample data
to test several hypotheses about this claim.
H0: σ2 = a
H1: σ2 > a ===> one-sided test
H0: σ2 = a
H1: σ2 < a ===> one-sided test
H0: σ2 = a
H1: σ2 ≠ a ===> two-sided test
We use the Chi-square frequency distribution, or “χ2-distribution”, to test these hypotheses. The χ2-distribution
is shown below:
[Figure: the χ2-distribution (vertical axis: probability; horizontal axis: χ2, starting at 0).]
The χ2-distribution is non-negative (begins at zero and extends to the right), asymmetric, and skewed to the right
(that is, with a tail that extends to the right).
All three hypothesis tests involve comparing a χ2test number with a χ2critical number (similar to a t-test, in which we
compare a ttest number with a tcritical number).
For all three hypothesis tests outlined above, the formula for calculating χ2test is:
\[ \chi^2_{test} = (n - 1) \cdot \frac{s^2}{\sigma^2} \]
(Note: The hypothesized value “a” is substituted for σ2 in the χ2test formula.)
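As a minimal sketch of the χ2test formula in Python: the sample values and the hypothesized variance a = 9 below are made up for illustration, and the function name chi2_test_statistic is our own label, not something from the handout.

```python
import statistics

def chi2_test_statistic(sample, a):
    """Return (n - 1) * s^2 / a, where s^2 is the sample variance
    and a is the hypothesized population variance."""
    n = len(sample)
    s2 = statistics.variance(sample)  # sample variance (divides by n - 1)
    return (n - 1) * s2 / a

# Hypothetical data, for illustration only
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
chi2_test = chi2_test_statistic(sample, a=9.0)
print(round(chi2_test, 3))
```

Note that statistics.variance already divides by n - 1, so it matches the sample variance s2 used in the formula.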
The value of χ2critical comes from the χ2-table (which is based on the χ2-distribution). The χ2-table (hard copy
distributed in class) gives the χ2critical numbers for various α-values and degrees of freedom (d.f.).
The method for finding the value of χ2critical depends on which hypothesis test is being conducted, as described
below.
H0: σ2 = a
H1: σ2 > a ===> one-sided test
For this test, degrees of freedom = d.f. = n – 1. We find χ2critical from the χ2-table based on d.f. = n – 1 and the
Significance Level “α” chosen for the test.
[Figure: χ2-distribution (vertical axis: probability; horizontal axis: χ2, starting at 0). The α-value area is the blue area to the right of χ2critical; the p-value area is the red area to the right of χ2test.]
For this test:
If χ2test > χ2critical ===> Reject H0 and Accept H1
or
If p-value area < α-value area ===> Reject H0 and Accept H1
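If software is used instead of the printed χ2-table, the same right-tailed test might be sketched as follows. This is only an illustration under stated assumptions: the helper names chi2_cdf and chi2_critical are ours, the series and bisection approach is one simple way to get the areas (adequate for modest d.f.), and the test number 50 is hypothetical.

```python
import math

def chi2_cdf(x, df):
    """Area to the LEFT of x under the chi-square curve with df degrees of
    freedom, using the series for the regularized incomplete gamma function."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-16 * total:
        term *= z / (a + k)
        total += term
        k += 1
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))

def chi2_critical(area_right, df):
    """Value with the given area to its RIGHT (the table's convention),
    found by bisection."""
    lo, hi = 0.0, 10.0 * df + 100.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if 1.0 - chi2_cdf(mid, df) > area_right:
            lo = mid   # too much area to the right of mid: move right
        else:
            hi = mid
    return (lo + hi) / 2.0

alpha, df = 0.05, 30                  # illustrative significance level and d.f.
chi2_test = 50.0                      # hypothetical test number
crit = chi2_critical(alpha, df)
p_value = 1.0 - chi2_cdf(chi2_test, df)
print(round(crit, 2), chi2_test > crit, p_value < alpha)
```

Both comparisons printed at the end give the same decision, which is the point of the "or" in the rule above.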
H0: σ2 = a
H1: σ2 < a ===> one-sided test
For this test, again degrees of freedom = d.f. = n – 1. We find χ2critical from the χ2-table based on d.f. = n – 1 and
the Significance Level “α” chosen for the test. For this test, the α-value area lies to the left of χ2critical. However,
the χ2-table gives the value of χ2critical based on the area to the right of χ2critical; this area is “(1- α).” For this reason,
we must use the value of (1-α) instead of α when looking up the value of χ2critical in the χ2-table.
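[Figure: χ2-distribution (vertical axis: probability; horizontal axis: χ2, starting at 0). The p-value area is the red area to the left of χ2test; the α-value area is the blue area to the left of χ2critical; the remaining area, to the right of χ2critical, is (1-α).]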
For this test:
If χ2test < χ2critical ===> Reject H0 and Accept H1
or
If p-value area < α-value area ===> Reject H0 and Accept H1
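The (1-α) lookup can be sketched with a small table excerpt. The dictionary below holds standard published χ2critical values for d.f. = 30, keyed by the area to the right; the test number 15 and the names CHI2_TABLE_DF30 and left_tail_critical are hypothetical, chosen only for this illustration.

```python
# Excerpt of a chi-square table for d.f. = 30, keyed by the area to the
# RIGHT of the critical value (standard published values, rounded).
CHI2_TABLE_DF30 = {0.975: 16.79, 0.95: 18.49, 0.05: 43.77, 0.025: 46.98}

def left_tail_critical(alpha, table):
    """For H1: sigma^2 < a, look up the table at right-tail area 1 - alpha."""
    return table[round(1.0 - alpha, 4)]

alpha = 0.05
crit = left_tail_critical(alpha, CHI2_TABLE_DF30)  # uses the 0.95 entry
chi2_test = 15.0                                    # hypothetical test number
reject_h0 = chi2_test < crit
print(crit, reject_h0)
```

Looking up α = 0.05 directly would return 43.77, the right-tail value, which is the mistake the (1-α) rule is meant to prevent.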
H0: σ2 = a
H1: σ2 ≠ a ===> two-sided test
For this test, yet again degrees of freedom = d.f. = n – 1. For this two-sided test, there are two χ2critical values, with
half of the α-value area lying beyond each χ2critical value (see graph below). We find both χ2critical values from the
χ2-table based on d.f. = n – 1 and the Significance Level “α” chosen for the test. When looking up the values of
χ2critical in the χ2-table, we use α/2 for the right-side χ2critical; however, for the left-side χ2critical we must use
[1 – (α/2)], because the χ2-table gives the value of χ2critical based on the area to the right of χ2critical.
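[Figure: χ2-distribution (vertical axis: probability; horizontal axis: χ2, starting at 0). An α/2 area lies to the left of the left-side χ2critical, and an α/2 area lies to the right of the right-side χ2critical.]
For this test:
If χ2test < χ2critical left-side or χ2test > χ2critical right-side ===> Reject H0 and Accept H1
or
If p-value area < α/2-value area ===> Reject H0 and Accept H1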
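The two-sided decision rule can be sketched as a small function. The critical values used below are the standard table values for d.f. = 30 and α = 0.05 (right-tail areas 0.975 and 0.025); the three test numbers are hypothetical, and the function name is ours.

```python
def two_sided_variance_test(chi2_test, crit_left, crit_right):
    """Reject H0 if the test number falls beyond either critical value."""
    return chi2_test < crit_left or chi2_test > crit_right

# Table values for d.f. = 30 and alpha = 0.05:
# right-side uses area alpha/2 = 0.025, left-side uses 1 - alpha/2 = 0.975.
crit_right = 46.98
crit_left = 16.79

for stat in (12.0, 30.0, 50.0):   # hypothetical test numbers
    print(stat, two_sided_variance_test(stat, crit_left, crit_right))
```

Only the middle value, which lies between the two critical values, leads to "do not reject H0."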
Using Two Sample Variances to Compare Two Population Variances
Consider now a situation in which there are two populations. A random sample has been drawn from each
population, and the sample variance has been calculated for variable X for each sample.
μ1 = population 1 mean (unknown)
σ21 = population 1 variance (unknown)
μ2 = population 2 mean (unknown)
σ22 = population 2 variance (unknown)
n1 = sample 1 size
Xbar1 = sample 1 mean
s21 = sample 1 variance
n2 = sample 2 size
Xbar2 = sample 2 mean
s22 = sample 2 variance
We can use the sample data to test two hypotheses about the relative sizes of σ21 and σ22 :
H0: σ21 = σ22
H1: σ21 > σ22 ===> one-sided test
H0: σ21 = σ22
H1: σ21 ≠ σ22 ===> two-sided test
We use the F-distribution to test these hypotheses.
The F-distribution is similar in shape to the χ2-distribution and is shown below:
[Figure: the F-distribution (vertical axis: probability; horizontal axis: F, starting at 0).]
Similar to the χ2-distribution, the F-distribution is non-negative (begins at zero and extends to the right),
asymmetric, and skewed to the right (that is, with a tail that extends to the right).
The two hypothesis tests involve comparing an Ftest number with an Fcritical number (similar to a t-test, in which we
compare a ttest number with a tcritical number). For the two types of hypothesis test outlined above, the formula for
calculating Ftest is:
\[ F_{test} = \frac{s_1^2}{s_2^2} \]
The value of Fcritical comes from the F-table (which is based on the F-distribution). The F-table (hard copy
distributed in class) gives the Fcritical numbers for various α-values and degrees of freedom (d.f.).
The F-table requires two degrees of freedom (d.f.) numbers, one for s21 in the numerator of the Ftest
formula, and one for s22 in the denominator of the Ftest formula. Creatively, econometricians have named these two d.f.
numbers d.f.numerator and d.f.denominator. For the hypothesis tests outlined above,
d.f.numerator = n1 – 1
and
d.f.denominator = n2 – 1.
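A minimal Python sketch of the Ftest number and its two degrees of freedom, computed from raw data. The two samples are made up for illustration, and the function name f_test_statistic is our own.

```python
import statistics

def f_test_statistic(sample1, sample2):
    """Return (F_test, df_numerator, df_denominator), with the variance of
    sample 1 in the numerator and the variance of sample 2 in the denominator."""
    s1_sq = statistics.variance(sample1)
    s2_sq = statistics.variance(sample2)
    return s1_sq / s2_sq, len(sample1) - 1, len(sample2) - 1

# Hypothetical samples, for illustration only
sample1 = [10.0, 14.0, 18.0, 22.0, 26.0]
sample2 = [12.0, 14.0, 16.0, 18.0]
f_test, df_num, df_den = f_test_statistic(sample1, sample2)
print(f_test, df_num, df_den)
```

Here s21 = 40 and s22 = 20/3, so Ftest is about 6, with d.f.numerator = 4 and d.f.denominator = 3.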
The method for finding the value of Fcritical depends on which hypothesis test is being conducted, as described
below.
H0: σ21 = σ22
H1: σ21 > σ22 ===> one-sided test
Note: If you wish to test the hypotheses
H0: σ21 = σ22
H1: σ21 < σ22 ===> one-sided test
simply switch the labels of populations 1 and 2, and then use the test procedure below.
For this test, we find Fcritical from the F-table based on d.f.numerator, d.f.denominator, and the Significance Level “α”
chosen for the test. The d.f.numerator number indicates the relevant column of the F-table, the d.f.denominator number
indicates the relevant row of the F-table, and the Significance Level “α” indicates the relevant sub-row of the F-table.
[Figure: F-distribution (vertical axis: probability; horizontal axis: F, starting at 0). The α-value area is the blue area to the right of Fcritical; the p-value area is the red area to the right of Ftest.]
For this test:
If Ftest > Fcritical ===> Reject H0 and Accept H1
or
If p-value area < α-value area ===> Reject H0 and Accept H1
H0: σ21 = σ22
H1: σ21 ≠ σ22 ===> two-sided test
For this two-sided test, there are two Fcritical values, with half of the α-value area lying beyond each Fcritical value
(see graph below). We find both Fcritical values from the F-table based on d.f.numerator, d.f.denominator, and the
Significance Level “α” chosen for the test. When looking up the values of Fcritical in the F-table, we use α/2 for
the right-side Fcritical; however, to find the left-side Fcritical in the F-table we must use [1 – (α/2)], because the
F-table gives the value of Fcritical based on the area to the right of Fcritical.
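[Figure: F-distribution (vertical axis: probability; horizontal axis: F, starting at 0). An α/2 area lies to the left of the left-side Fcritical, and an α/2 area lies to the right of the right-side Fcritical.]
For this test:
If Ftest < Fcritical left-side or Ftest > Fcritical right-side ===> Reject H0 and Accept H1
or
If p-value area < α/2-value area ===> Reject H0 and Accept H1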
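Some printed F-tables list only small right-tail areas and have no entry for [1 – (α/2)]. In that case, a standard identity (not stated in this handout) gives the left-side value: the left-side Fcritical for (d.f.numerator, d.f.denominator) equals 1 divided by the right-side Fcritical looked up with the two d.f. numbers swapped. The sketch below assumes α = 0.10 and 10 d.f. in both numerator and denominator, for which the standard table value at area 0.05 is 2.98; the function names and the test numbers are hypothetical.

```python
def f_left_critical(f_right_swapped):
    """Left-side critical value from the right-tail table value that was
    looked up with the degrees of freedom swapped:
    F_left(d1, d2) = 1 / F_right(d2, d1)."""
    return 1.0 / f_right_swapped

def two_sided_f_test(f_test, f_left, f_right):
    """Reject H0 if F_test falls beyond either critical value."""
    return f_test < f_left or f_test > f_right

# alpha = 0.10, so alpha/2 = 0.05; both d.f. numbers are 10, so swapping
# them leaves the table entry unchanged (2.98).
f_right = 2.98
f_left = f_left_critical(2.98)
print(round(f_left, 3), two_sided_f_test(1.50, f_left, f_right))
```

With unequal d.f. numbers, the swapped lookup is a different table entry from the right-side Fcritical, so the two lookups must be done separately.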
Example: Using a Sample Variance to Test a Hypothesis about a Population Variance
Suppose someone claims that the variance of X in a population is equal to 9, that is, σ2 = 9. To test this
hypothesis, a random sample of size n = 31 is collected, and the sample variance is s2 = 12. Because the sample
variance is larger than the claimed value, we decide to run the following hypothesis test:
H0: σ2 = 9
H1: σ2 > 9 ===> one-sided test
\[ \chi^2_{test} = (n - 1) \cdot \frac{s^2}{\sigma^2} = (31 - 1) \cdot \frac{12}{9} = 40 \]
In this example, d.f. = 31 – 1 = 30.
We decide to use α = 0.05 for the test.
From the χ2-table we find χ2critical = 43.77.
Because χ2test = 40 < χ2critical = 43.77, we cannot reject H0. So, we conclude that the claim σ2 = 9 cannot be rejected at α = 0.05.
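The same example can be checked with the p-value approach. The sketch below relies on a closed form that holds only when d.f. is even (here d.f. = 30): the area to the right of x equals exp(-x/2) times the sum of (x/2)^i / i! for i from 0 to d.f./2 - 1. The function name is ours, and this is an illustration rather than a general-purpose routine.

```python
import math

def chi2_right_tail_even_df(x, df):
    """Area to the right of x for a chi-square distribution with an EVEN
    number of degrees of freedom."""
    if df % 2 != 0:
        raise ValueError("this closed form only holds for even d.f.")
    half = x / 2.0
    term, total = 1.0, 1.0          # i = 0 term
    for i in range(1, df // 2):     # i = 1, ..., df/2 - 1
        term *= half / i
        total += term
    return math.exp(-half) * total

n, s2, a, alpha = 31, 12.0, 9.0, 0.05    # values from the example above
chi2_test = (n - 1) * s2 / a
p_value = chi2_right_tail_even_df(chi2_test, n - 1)
print(chi2_test, round(p_value, 3), p_value < alpha)
```

The p-value area comes out a little above 0.10, which is larger than α = 0.05, so the p-value rule also says not to reject H0.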
Example: Using Two Sample Variances to Compare Two Population Variances
Suppose we are comparing the variances in the SAT math scores between a population of males and a population
of females. We have a random sample from each population, as described below.
σ21 = population 1 (males) variance (unknown)
σ22 = population 2 (females) variance (unknown)
n1 = sample size = 23
s21 = sample variance = 83.88
n2 = sample size = 24
s22 = sample variance = 46.61
Suppose we wish to test the following hypothesis with a Confidence Level of 99%.
H0: σ21 = σ22
H1: σ21 > σ22 ===> one-sided test
\[ F_{test} = \frac{s_1^2}{s_2^2} = \frac{83.88}{46.61} = 1.80 \]
α = 0.01
d.f.numerator = n1 – 1 = 22
d.f.denominator = n2 – 1 = 23
====>
Fcritical = 2.94
For this test:
Because Ftest = 1.80 < Fcritical = 2.94 ===> Do not Reject H0.
Conclude: at α = 0.01, the samples do not provide sufficient evidence that the variance in male SAT math scores is larger than the variance in female SAT math scores.
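As a cross-check using the p-value rule, the area to the right of Ftest can be approximated numerically. The sketch below integrates the F-distribution density with Simpson's rule; the helper names are ours, the density shortcut at zero assumes d.f.numerator > 2, and a value computed this way for the exact degrees of freedom need not match a number read from the nearest row or column of a printed table.

```python
import math

def f_pdf(x, d1, d2):
    """Density of the F-distribution; assumes d1 > 2, so the density is 0 at x = 0."""
    if x <= 0:
        return 0.0
    log_beta = (math.lgamma(d1 / 2) + math.lgamma(d2 / 2)
                - math.lgamma((d1 + d2) / 2))
    log_pdf = ((d1 / 2) * math.log(d1 / d2)
               + (d1 / 2 - 1) * math.log(x)
               - ((d1 + d2) / 2) * math.log(1 + d1 * x / d2)
               - log_beta)
    return math.exp(log_pdf)

def f_right_tail(f_value, d1, d2, steps=2000):
    """Area to the right of f_value: 1 minus the Simpson's-rule area
    between 0 and f_value (steps must be even)."""
    h = f_value / steps
    area = f_pdf(0.0, d1, d2) + f_pdf(f_value, d1, d2)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * f_pdf(i * h, d1, d2)
    return 1.0 - area * h / 3.0

s1_sq, n1 = 83.88, 23     # males, from the example above
s2_sq, n2 = 46.61, 24     # females, from the example above
alpha = 0.01
f_test = s1_sq / s2_sq
p_value = f_right_tail(f_test, n1 - 1, n2 - 1)
print(round(f_test, 2), p_value < alpha)
```

The p-value area is well above α = 0.01, so this approach agrees with the table comparison: do not reject H0.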