Chapter 9 – Inferences Concerning Variances

Estimation of Variances

Recall that for a random sample of size n from any population distribution, the sample variance is defined as

$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2.$$

Here, we are considering $X_1, X_2, \ldots, X_n$ and $\bar{X}$ to be random variables (whose values depend on the particular random sample we happen to select from the population). Therefore, $S^2$ is also a random variable. The sample variance is always an unbiased estimator of the population variance, $\sigma^2$ (proving this requires some messy algebra).

When we discussed inference about the difference between the means of two independent populations, there were two cases to consider: either we could assume that the two populations had equal variances, or we could not make such an assumption. We want to be able to test whether the two population variances are unequal. In other words, we want to test the two hypotheses

$$H_0: \sigma_1^2 = \sigma_2^2 \quad \text{v.} \quad H_a: \sigma_1^2 \neq \sigma_2^2.$$

Defn: Let $X_1, X_2, \ldots, X_n$ be a random sample from a distribution which is normal with mean $\mu$ and variance $\sigma^2$. The sample variance is defined as $S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$. The random variable

$$\frac{(n-1)S^2}{\sigma^2} = \frac{\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2}{\sigma^2}$$

has a chi-square distribution with d.f. = n – 1.

The p.d.f. for a distribution which is chi-square with k degrees of freedom is given by

$$f(y) = \frac{1}{\Gamma\!\left(\frac{k}{2}\right) 2^{k/2}}\, y^{\frac{k}{2}-1} e^{-y/2}, \quad \text{for } y > 0.$$

(Note that this is just a gamma distribution with $\alpha = k/2$ and $\beta = 2$.) The mean of a chi-square(k) distribution is k. The variance of the distribution is 2k.

As I stated earlier in the course, in the analysis of experimental data, our decision is based on a statistic which has the form of a "signal-to-noise" ratio. Here we introduce that statistic. It will also be used later.

Defn: Let W and Y be independent chi-square random variables with u and v degrees of freedom, respectively. Then the random variable

$$F = \frac{W/u}{Y/v}$$

has an F distribution with numerator degrees of freedom u and denominator degrees of freedom v. The p.d.f. of this distribution is

$$f(y) = \frac{\Gamma\!\left(\frac{u+v}{2}\right)}{\Gamma\!\left(\frac{u}{2}\right)\Gamma\!\left(\frac{v}{2}\right)} \left(\frac{u}{v}\right)^{u/2} y^{\frac{u}{2}-1} \left(1 + \frac{u}{v}\,y\right)^{-\frac{u+v}{2}}, \quad \text{for } y > 0.$$

The mean of an $F_{u,v}$ distribution is $\frac{v}{v-2}$, provided v > 2. The variance of an $F_{u,v}$ distribution is

$$\frac{2v^2(u+v-2)}{u(v-2)^2(v-4)},$$

provided v > 4.

Let $X_{11}, X_{12}, \ldots, X_{1n_1}$ be a random sample from a distribution which is normal with mean $\mu_1$ and variance $\sigma_1^2$. Let $X_{21}, X_{22}, \ldots, X_{2n_2}$ be a random sample, independent of the first, from a distribution which is normal with mean $\mu_2$ and variance $\sigma_2^2$. Then the random variable

$$F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}$$

has an F distribution with numerator degrees of freedom $u = n_1 - 1$ and denominator degrees of freedom $v = n_2 - 1$.

Testing Hypotheses About the Equality of Variances

We will assume that we have two independent random samples from two normal distributions, the first having variance $\sigma_1^2$, and the second having variance $\sigma_2^2$. We want to test whether the two variances are equal. The test statistic to be used is

$$F = \frac{S_1^2}{S_2^2}.$$

Under the null hypothesis, $\sigma_1^2 = \sigma_2^2$, so the unknown variances cancel in the ratio above, and this statistic has an F distribution with numerator d.f. = n1 – 1 and denominator d.f. = n2 – 1.

Example: The void volume within a textile fabric affects comfort, flammability, and insulation properties. Permeability of a fabric refers to the accessibility of void spaces to the flow of a gas or liquid. The paper "The relationship between porosity and air permeability of woven textile fabrics" (Journal of Testing and Evaluation, 1997: 108-114) gave summary information on air permeability (cm³/cm²/sec) for a number of different fabric types. Consider the following data on two different types of plain-weave fabric:

Fabric Type    Sample Size    Sample Mean    Sample SD
Cotton         10             51.71          0.79
Triacetate     10             136.14         3.59

We want to test whether plain-weave triacetate has a higher mean permeability than plain-weave cotton. However, to do this test, we need to check the assumption of equal population variances, so that we know which test statistic to use to compare the means.
(Since we have small samples, there is an additional assumption that needs to be checked, the assumption of normality. However, since we do not have the raw data for this example, we cannot do normal probability plots.)
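As a rough illustration of the variance check for this example, the sketch below computes the statistic F = S1²/S2² from the summary data (cotton taken as sample 1, triacetate as sample 2) and approximates a two-sided p-value. The helper names `f_pdf` and `f_cdf` are hypothetical, and integrating the F density with Simpson's rule is a stand-in chosen for this sketch; it is not the method used in these notes, where an F table or a statistics package would normally supply the probability. The integration as written assumes numerator d.f. greater than 2.

```python
import math

def f_pdf(y, u, v):
    """Density of the F distribution with u and v degrees of freedom."""
    if y <= 0:
        return 0.0
    # Work on the log scale to avoid overflow in the gamma functions.
    log_c = (math.lgamma((u + v) / 2) - math.lgamma(u / 2)
             - math.lgamma(v / 2) + (u / 2) * math.log(u / v))
    return math.exp(log_c + (u / 2 - 1) * math.log(y)
                    - ((u + v) / 2) * math.log1p(u * y / v))

def f_cdf(x, u, v, steps=20000):
    """Approximate P(F <= x) by composite Simpson's rule (assumes u > 2)."""
    if x <= 0:
        return 0.0
    h = x / steps  # steps must be even for Simpson's rule
    total = f_pdf(0.0, u, v) + f_pdf(x, u, v)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, u, v)
    return total * h / 3

# Summary data from the example (sample sizes and sample SDs).
n1, s1 = 10, 0.79   # cotton
n2, s2 = 10, 3.59   # triacetate

f_stat = s1**2 / s2**2      # test statistic F = S1^2 / S2^2
u, v = n1 - 1, n2 - 1       # numerator and denominator d.f.

# Two-sided p-value: double the smaller tail probability.
lower_tail = f_cdf(f_stat, u, v)
p_value = 2 * min(lower_tail, 1 - lower_tail)

print(f"F = {f_stat:.4f} with d.f. ({u}, {v}); two-sided p-value = {p_value:.2e}")
```

The statistic is approximately 0.048, far below 1, so the data give strong evidence against equal population variances. This suggests comparing the means with the procedure that does not assume equal variances.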