Statistics 2014, Fall 2001

Chapter 9 – Inferences Concerning Variances
Estimation of Variances
Recall that for a random sample of size n from (any) population distribution, the sample variance is
defined as
$$S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2.$$
Here, we are considering $X_1, X_2, \ldots, X_n$, and $\bar{X}$ to be random variables (whose values depend on the
particular random sample we will happen to select from the population). Therefore, $S^2$ is also a
random variable.
The sample variance is an unbiased estimator of the population variance, $\sigma^2$, whenever that
variance is finite (proving this requires some messy algebra).
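The algebra behind the unbiasedness claim can be sketched in a few lines. This is an outline rather than a full proof. It assumes the observations are independent draws from a population with mean $\mu$ (notation introduced here for the sketch) and finite variance $\sigma^2$, and it uses the standard facts $E[X_i^2] = \sigma^2 + \mu^2$ and $\mathrm{Var}(\bar{X}) = \sigma^2 / n$.

```latex
\begin{align*}
\sum_{i=1}^{n} (X_i - \bar{X})^2 &= \sum_{i=1}^{n} X_i^2 - n\bar{X}^2, \\
E\!\left[X_i^2\right] &= \sigma^2 + \mu^2, \qquad
E\!\left[\bar{X}^2\right] = \frac{\sigma^2}{n} + \mu^2, \\
E\!\left[\sum_{i=1}^{n} (X_i - \bar{X})^2\right]
  &= n(\sigma^2 + \mu^2) - n\left(\frac{\sigma^2}{n} + \mu^2\right) = (n-1)\sigma^2, \\
E\!\left[S^2\right] &= \frac{(n-1)\sigma^2}{n-1} = \sigma^2.
\end{align*}
```

The third line shows why the divisor is n – 1 rather than n: the expected sum of squared deviations about the sample mean is $(n-1)\sigma^2$, not $n\sigma^2$.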
When we discussed inference about the difference between the means of two independent populations,
there were two cases to consider – either we could assume that the two populations had equal
variances, or we could not make such an assumption. We want to be able to test whether the two
population variances are unequal. In other words, we want to test the two hypotheses
$H_0: \sigma_1^2 = \sigma_2^2$ v. $H_a: \sigma_1^2 \neq \sigma_2^2$.
Defn: Let $X_1, X_2, \ldots, X_n$ be a random sample from a distribution which is normal with mean $\mu$ and
variance $\sigma^2$. The sample variance is defined as
$$S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2.$$
The random variable
$$X^2 = \frac{(n-1)S^2}{\sigma^2} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{\sigma^2}$$
has a chi-square distribution with d.f. = n – 1. The p.d.f. for a distribution which is chi-square with k
degrees of freedom is given by
$$f(y) = \frac{1}{2^{k/2}\,\Gamma(k/2)}\, y^{k/2 - 1} e^{-y/2}, \quad \text{for } y > 0.$$
(Note that this is just a gamma distribution with $\alpha = k/2$ and $\beta = 2$.)
The mean of a chi-square(k) distribution is k. The variance of the distribution is 2k.
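A quick simulation can make the chi-square result more concrete. The sketch below is only an illustration: the sample size, population mean, standard deviation, number of replications, and seed are arbitrary choices, and the function name `simulate_scaled_variance` is made up for this example. It draws many normal samples, computes (n – 1)S²/σ² for each, and compares the simulated mean and variance with k and 2k for k = n – 1.

```python
import random
import statistics


def simulate_scaled_variance(n, sigma, reps, seed=1):
    """Return reps simulated values of (n - 1) * S^2 / sigma^2 from normal samples."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    values = []
    for _ in range(reps):
        # The population mean (10.0 here) is arbitrary; it cancels out of S^2.
        sample = [rng.gauss(10.0, sigma) for _ in range(n)]
        s2 = statistics.variance(sample)  # sample variance, divisor n - 1
        values.append((n - 1) * s2 / sigma**2)
    return values


n = 5
k = n - 1  # degrees of freedom
values = simulate_scaled_variance(n=n, sigma=3.0, reps=20000)
sim_mean = statistics.mean(values)
sim_var = statistics.variance(values)
print(f"simulated mean {sim_mean:.3f} (theory {k})")
print(f"simulated variance {sim_var:.3f} (theory {2 * k})")
```

The simulated values should land close to 4 and 8, with some sampling noise. Changing `sigma` should not change them, since the statistic is scaled by σ².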
As I stated earlier in the course, in analysis of experimental data, our decision is based on a statistic
which has the form of a “signal-to-noise” ratio. Here we introduce that statistic. It will also be used
later.
Defn: Let W and Y be independent chi-square random variables with u and v degrees of freedom,
respectively. Then the random variable
$$F = \frac{W/u}{Y/v}$$
has an F distribution with numerator degrees of freedom u and denominator degrees of freedom v. The
p.d.f. of this distribution is
$$f(y) = \frac{\Gamma\!\left(\frac{u+v}{2}\right)}{\Gamma\!\left(\frac{u}{2}\right)\Gamma\!\left(\frac{v}{2}\right)} \left(\frac{u}{v}\right)^{u/2} \frac{y^{u/2 - 1}}{\left(1 + \frac{u}{v}\,y\right)^{(u+v)/2}}, \quad \text{for } y > 0.$$
The mean of an $F_{u,v}$ distribution is
$$\mu = \frac{v}{v-2},$$
provided v > 2. The variance of an $F_{u,v}$ distribution is
$$\sigma^2 = \frac{2v^2(u + v - 2)}{u(v-2)^2(v-4)},$$
provided v > 4.
Let $X_{11}, X_{12}, \ldots, X_{1n_1}$ be a random sample from a distribution which is normal with mean $\mu_1$ and
variance $\sigma_1^2$. Let $X_{21}, X_{22}, \ldots, X_{2n_2}$ be a random sample from a distribution which is normal with
mean $\mu_2$ and variance $\sigma_2^2$. Then the random variable
$$F = \frac{S_1^2 / \sigma_1^2}{S_2^2 / \sigma_2^2}$$
has an F distribution with numerator degrees of freedom $u = n_1 - 1$, and denominator degrees of
freedom $v = n_2 - 1$.
Testing Hypotheses About the Equality of Variances
We will assume that we have two independent random samples from two normal distributions, the first
having variance $\sigma_1^2$, and the second having variance $\sigma_2^2$. We want to test whether the two variances
are equal. The test statistic to be used is
$$F = \frac{S_1^2}{S_2^2}.$$
Under the null hypothesis, this statistic has an F distribution with numerator d.f. = n1 – 1, and
denominator d.f. = n2 – 1.
Example: The void volume within a textile fabric affects comfort, flammability, and insulation
properties. Permeability of a fabric refers to the accessibility of void spaces to the flow of a gas or
liquid. The paper “The relationship between porosity and air permeability of woven textile fabrics”
(Journal of Testing and Evaluation, 1997: 108-114) gave summary information on air permeability
(cm³/cm²/sec) for a number of different fabric types. Consider the following data on two different
types of plain-weave fabric:
Fabric Type    Sample Size    Sample Mean    Sample SD
Cotton         10             51.71          0.79
Triacetate     10             136.14         3.59
We want to test whether plain-weave triacetate has a higher mean permeability than plain-weave
cotton. However, to do this test, we need to check the assumption of equal population variances, so
that we know which test statistic to use to compare the means. (Since we have small samples, there is
an additional assumption that needs to be checked, the assumption of normality. However, since we
do not have the raw data for this example, we cannot do normal probability plots.)
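The variance check for this example can be sketched in code from the summary statistics alone. The block below is a rough illustration, not the notes' own worked solution. It assumes both populations are normal, takes cotton as sample 1 and triacetate as sample 2, and uses the two-sided alternative stated earlier. The helper names `f_pdf` and `f_cdf` are made up for this example. The p-value comes from integrating the F density with Simpson's rule, which stands in for a table lookup or a statistics package.

```python
import math


def f_pdf(y, u, v):
    """Density of the F distribution with u and v degrees of freedom (0 for y <= 0)."""
    if y <= 0.0:
        return 0.0
    log_const = (math.lgamma((u + v) / 2) - math.lgamma(u / 2) - math.lgamma(v / 2)
                 + (u / 2) * math.log(u / v))
    log_f = (log_const + (u / 2 - 1) * math.log(y)
             - ((u + v) / 2) * math.log1p(u * y / v))
    return math.exp(log_f)


def f_cdf(x, u, v, steps=2000):
    """Approximate P(F <= x) by composite Simpson's rule on [0, x].

    Assumes u > 2, so the density tends to 0 at the origin, and an even `steps`.
    """
    if x <= 0.0:
        return 0.0
    h = x / steps
    total = f_pdf(0.0, u, v) + f_pdf(x, u, v)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, u, v)
    return total * h / 3


# Summary statistics from the table (cotton = sample 1, triacetate = sample 2).
n1, s1 = 10, 0.79
n2, s2 = 10, 3.59

f_stat = s1**2 / s2**2   # F = S1^2 / S2^2, about 0.048 for these data
u, v = n1 - 1, n2 - 1    # numerator and denominator degrees of freedom
lower_tail = f_cdf(f_stat, u, v)
p_two_sided = 2 * min(lower_tail, 1 - lower_tail)
print(f"F = {f_stat:.4f} with d.f. ({u}, {v}); two-sided p-value = {p_two_sided:.2e}")
```

A very small p-value here would count as evidence against equal population variances. That would point toward the version of the two-sample test for means that does not assume equal variances.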