Hypothesis Tests for Comparing Two Population Means

In some applications, it is natural to consider the problem as comparing population means. If one knew the means (say \(\mu_1\) and \(\mu_2\)), then a direct comparison could be made by computing \(\mu_1 - \mu_2\); however, this is typically not the case! Suppose \(\bar{X}_1\) and \(\bar{X}_2\) represent the sample means based on samples selected from populations 1 and 2.

What is \(E(\bar{X}_1)\)? \(E(\bar{X}_2)\)?

What is \(E(\bar{X}_1 - \bar{X}_2) = E(\bar{X}_1) - E(\bar{X}_2)\)?

In a manner similar to our approach to hypothesis testing for \(\mu\), we can use \(\bar{X}_1 - \bar{X}_2\) to obtain information about \(\mu_1 - \mu_2\) and to test hypotheses about \(\mu_1 - \mu_2\).

Example: A psychologist wishes to conduct an investigation to determine whether or not car ownership is detrimental to academic achievement for undergraduates.

How could we conduct such a study? What data could we collect? What might be an appropriate measure of academic achievement?

What population(s) are we interested in? What are the parameter(s) that we would like to investigate?

Given the aim of the investigation, what hypotheses should we form?

\(H_0\):

\(H_1\):

The random variable \(\bar{X}_1 - \bar{X}_2\) has an approximately normal distribution under repeated sampling. Thus, we can form the test statistic

\[ z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \]

to test our hypotheses, where \(\mu_1 - \mu_2\) is the difference specified under \(H_0\) (assuming that \(\sigma_1^2\) and \(\sigma_2^2\) are known and that \(n_1\) and \(n_2\) are sufficiently large to justify the normality of \(\bar{X}_1 - \bar{X}_2\)). When is this so?

Assuming \(\alpha = .05\), our decision rule can be written as:

Now suppose that we take a random sample of 100 students who own cars and 100 students who do not own cars. The \(n_1 = 100\) non-car owners had a mean grade point average of \(\bar{x}_1 = 2.7\), as opposed to \(\bar{x}_2 = 2.54\) for the \(n_2 = 100\) car owners. Assume that \(\sigma_1^2 = 0.36\) and \(\sigma_2^2 = 0.40\).

What is the value of z?

Our conclusion?

What is the p-value?

Note: If \(\sigma_1^2\) and \(\sigma_2^2\) are not known, the values of \(s_1^2\) and \(s_2^2\) may be substituted into our computation of z as long as \(n_1\) and \(n_2\) are reasonably large (\(\ge 30\)).
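The computation of z and its p-value for the car-ownership example can be sketched in Python as a way to check hand calculations. This is a minimal sketch, not part of the original handout: the function name two_sample_z is hypothetical, and the one-sided alternative (mu1 - mu2 > 0, i.e. non-car owners have the higher mean GPA) is an assumption, since the handout leaves the hypotheses for the reader to fill in.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(xbar1, xbar2, var1, var2, n1, n2, delta0=0.0):
    """z statistic for H0: mu1 - mu2 = delta0, with known population variances.

    Hypothetical helper written for illustration only.
    """
    standard_error = sqrt(var1 / n1 + var2 / n2)
    return ((xbar1 - xbar2) - delta0) / standard_error


# Figures from the car-ownership example (population 1 = non-car owners).
z = two_sample_z(xbar1=2.7, xbar2=2.54, var1=0.36, var2=0.40, n1=100, n2=100)

# ASSUMPTION: one-sided alternative H1: mu1 - mu2 > 0,
# so the p-value is the upper-tail area beyond z.
p_value = 1.0 - NormalDist().cdf(z)

print(f"z = {z:.3f}, one-sided p-value = {p_value:.4f}")
```

Under these assumptions the sketch gives z of about 1.84 and an upper-tail p-value of about 0.033. For a two-sided alternative the p-value would instead be twice the tail area beyond |z|.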
Example 2: Company X has been a supplier to Company Y for more than 20 years. Ten years ago, X introduced a new product that was subsequently found to contain major defects. Y now claims that the introduction of this product has caused a reduction in sales because of customer dissatisfaction and loss of goodwill. In order to test this claim, the sales for 60 months prior to the introduction of the product were sampled and averaged, yielding \(\bar{x}_1 = 58\) (units are $100,000). Sales for 36 months after the introduction of the product yielded \(\bar{x}_2 = 42\). If it is accepted that \(\sigma_1^2 = \sigma_2^2 = 400\), is there evidence at the 1% level to support Y's claim?

Hypotheses:

\(H_0\):

\(H_1\):

Decision rule:

Test statistic:

Conclusion:
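One way to check a hand-worked answer to Example 2 is the critical-value approach sketched below. It is a rough sketch under stated assumptions rather than the handout's own solution: the hypotheses H0: mu1 - mu2 = 0 versus H1: mu1 - mu2 > 0 (mean sales were higher before the product was introduced) are our assumed reading of Y's claim, and all variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Figures from Example 2 (population 1 = months before the product launch).
xbar1, n1 = 58.0, 60
xbar2, n2 = 42.0, 36
sigma_sq = 400.0  # common variance accepted in the example
alpha = 0.01      # 1% significance level

# ASSUMPTION: H0: mu1 - mu2 = 0 versus H1: mu1 - mu2 > 0.
standard_error = sqrt(sigma_sq / n1 + sigma_sq / n2)
z = (xbar1 - xbar2) / standard_error

# Upper-tail critical value: reject H0 when z exceeds it.
z_crit = NormalDist().inv_cdf(1.0 - alpha)
reject_h0 = z > z_crit

print(f"z = {z:.2f}, critical value = {z_crit:.3f}, reject H0: {reject_h0}")
```

With these inputs z is about 3.79 against a critical value of about 2.326, so under the assumed one-sided hypotheses H0 would be rejected at the 1% level.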