Hypothesis Tests for Comparing Two Populations

Hypothesis Tests for Comparing Two Population Means
In some applications, it is natural to consider the problem as comparing
population means. If one knew the means (say \(\mu_1\) and \(\mu_2\)), then a direct comparison could
be made by computing \(\mu_1 - \mu_2\); however, the population means are typically unknown.
Suppose \(\bar{X}_1\) and \(\bar{X}_2\) represent the sample means based on samples selected from
populations 1 and 2. What is \(E(\bar{X}_1)\)? \(E(\bar{X}_2)\)?
What is \(E(\bar{X}_1 - \bar{X}_2) = E(\bar{X}_1) - E(\bar{X}_2)\)?
In a manner similar to our approach to hypothesis testing for \(\mu\), we can use \((\bar{X}_1 - \bar{X}_2)\) to
obtain information about \(\mu_1 - \mu_2\) and test hypotheses about \(\mu_1 - \mu_2\).
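The expectation asked about above follows from linearity of expectation. As a brief sketch, assuming the two samples are independent random samples of sizes \(n_1\) and \(n_2\) from populations with variances \(\sigma_1^2\) and \(\sigma_2^2\) (independence is an assumption made here, not something the notes state explicitly), the mean and variance of the difference are:

```latex
\begin{align*}
E(\bar{X}_1 - \bar{X}_2) &= E(\bar{X}_1) - E(\bar{X}_2) = \mu_1 - \mu_2, \\
\operatorname{Var}(\bar{X}_1 - \bar{X}_2) &= \operatorname{Var}(\bar{X}_1) + \operatorname{Var}(\bar{X}_2)
  = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}.
\end{align*}
```

The variances add even though the means are subtracted, because the variance of a difference of independent random variables is the sum of their variances.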
Example: A psychologist wishes to conduct an investigation to determine whether or not
car ownership is detrimental to academic achievement for undergraduates.
How could we conduct such a study? What data could we collect?
What might be an appropriate measure of academic achievement?
What population(s) are we interested in?
What are the parameter(s) that we would like to investigate?
Given the aim of the investigation, what hypotheses should we form?
The random variable \((\bar{X}_1 - \bar{X}_2)\) has an approximate normal distribution under
repeated sampling. Thus, we can form the test statistic
\[
z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}
\]
to test our hypotheses (assuming that \(\sigma_1^2\) and \(\sigma_2^2\) are known and \(n_1\) and \(n_2\) are sufficiently
large to justify the normality of \((\bar{X}_1 - \bar{X}_2)\)). When is this so?
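The statistic is the observed difference in sample means, minus the difference hypothesized under the null, divided by the standard error sqrt(sigma1^2/n1 + sigma2^2/n2). A minimal Python sketch follows; the function name `two_sample_z`, its parameter names, and the input values are hypothetical and chosen only for illustration.

```python
from math import sqrt


def two_sample_z(xbar1, xbar2, var1, var2, n1, n2, delta0=0.0):
    """Return the z statistic for H0: mu1 - mu2 = delta0.

    var1 and var2 are the (assumed known) population variances.
    """
    # Standard error of the difference between the two sample means.
    standard_error = sqrt(var1 / n1 + var2 / n2)
    return ((xbar1 - xbar2) - delta0) / standard_error


# Hypothetical inputs chosen only to make the arithmetic easy to follow:
# standard error = sqrt(4/100 + 5/100) = 0.3, difference in means = 0.6.
z = two_sample_z(xbar1=10.6, xbar2=10.0, var1=4.0, var2=5.0, n1=100, n2=100)
print(round(z, 3))
```

Keeping the hypothesized difference as a parameter (`delta0`) lets the same function cover null hypotheses other than "no difference".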
Assuming \(\alpha = 0.05\), our decision rule can be written as:
Now suppose that we take a random sample of 100 students who own cars and 100
students who do not own cars. The \(n_1 = 100\) non-car owners had a mean grade point
average of \(\bar{x}_1 = 2.7\), as opposed to \(\bar{x}_2 = 2.54\) for the \(n_2 = 100\) car owners.
Assume that \(\sigma_1^2 = 0.36\) and \(\sigma_2^2 = 0.40\).
What is the value of z?
Our conclusion?
What is the p-value?
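One way to check the arithmetic for this example is sketched below in Python. It assumes the hypotheses H0: mu1 - mu2 = 0 versus Ha: mu1 - mu2 > 0, with population 1 the non-car owners, which is one reasonable reading of "car ownership is detrimental"; a different choice of hypotheses would change the cutoff and the p-value. The variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

# Summary values given in the example (population 1 = non-car owners).
xbar1, xbar2 = 2.70, 2.54
var1, var2 = 0.36, 0.40
n1, n2 = 100, 100
alpha = 0.05

# Assumed hypotheses: H0: mu1 - mu2 = 0 versus Ha: mu1 - mu2 > 0.
standard_error = sqrt(var1 / n1 + var2 / n2)
z = ((xbar1 - xbar2) - 0.0) / standard_error

std_normal = NormalDist()                       # mean 0, standard deviation 1
critical_value = std_normal.inv_cdf(1 - alpha)  # upper-tail cutoff
p_value = 1 - std_normal.cdf(z)                 # upper-tail area beyond z
reject_h0 = z > critical_value

print(f"z = {z:.3f}, critical value = {critical_value:.3f}")
print(f"p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

Under these assumptions z is about 1.84, which exceeds the cutoff of about 1.645, so H0 would be rejected at the 5% level. With a two-sided alternative the p-value would double and the cutoff would be larger.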
Note: If \(\sigma_1^2\) and \(\sigma_2^2\) are not known, the values of \(s_1^2\) and \(s_2^2\) may be substituted into our
computation of z as long as \(n_1\) and \(n_2\) are reasonably large (\(\geq 30\)).
Example 2:
Company X has been a supplier to Company Y for more than 20 years. Ten years
ago, X introduced a new product that was subsequently found to contain major defects. Y
now claims that the introduction of this product has caused a reduction in sales because of
customer dissatisfaction and loss of goodwill. In order to test this claim, the sales for 60
months prior to the introduction of the product were sampled and averaged, yielding \(\bar{x}_1 = 58\)
(units are $100,000). Sales for 36 months after the introduction of the product yielded
\(\bar{x}_2 = 42\). If it is accepted that \(\sigma_1^2 = \sigma_2^2 = 400\), is there evidence at the 1% level to support
Y’s claim?
Decision rule:
Test statistic:
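A hedged sketch of how the decision rule and test statistic for Example 2 might be worked out in Python. It assumes population 1 is the period before the product was introduced and population 2 the period after, so that Y's claim of reduced sales corresponds to Ha: mu1 - mu2 > 0 against H0: mu1 - mu2 = 0. The variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

# Summary values given in Example 2 (sales in units of $100,000).
xbar_before, xbar_after = 58.0, 42.0
var_before, var_after = 400.0, 400.0
n_before, n_after = 60, 36
alpha = 0.01

# Decision rule (assumed upper-tailed test): reject H0 if z > critical value.
std_normal = NormalDist()
critical_value = std_normal.inv_cdf(1 - alpha)

# Test statistic under H0: mu1 - mu2 = 0.
standard_error = sqrt(var_before / n_before + var_after / n_after)
z = (xbar_before - xbar_after) / standard_error

p_value = 1 - std_normal.cdf(z)
reject_h0 = z > critical_value

print(f"Reject H0 if z > {critical_value:.3f}")
print(f"z = {z:.3f}, p-value = {p_value:.5f}, reject H0: {reject_h0}")
```

Under these assumptions z is about 3.79, well above the 1% cutoff of about 2.33, so the data would support Y's claim. The unequal sample sizes enter only through the standard error.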