Chapter 8 – Comparing Two Treatments
Inference about Two Population Means
We want to compare the means of two populations to see whether they differ. There are two situations
to consider, as shown in the following examples:
1) In an experiment designed to study the effects of illumination level on task performance
(“Performance of Complex Tasks Under Different Levels of Illumination,” J. Illuminating
Engineering, 1976: 235-242), subjects were required to insert a fine-tipped probe into the eyeholes of
ten needles in rapid succession both for a low-light-level with a black background and for a higher
level with a white background. It is of interest to compare the mean times for completion of the task
under the two different conditions.
2) Compare the mean lifetime, $\mu_1$, for transistors produced by production line 1 to the mean lifetime,
$\mu_2$, for transistors produced by production line 2. We want to know whether these two means differ.
In the first case, we are comparing related means, using dependent samples. For each member of one
sample, there is a matched member of the other sample.
In the second case, we are comparing unrelated means, using independent samples. There is no natural
way to match each member of one sample with a member of the other sample.
We will use somewhat different procedures for hypothesis tests, depending on whether our samples are
dependent or independent.
There is another issue to be considered: are the variances of the two populations equal or unequal?
This issue, of course, did not arise with inference about a single population. We will see that the
procedure for inference about the difference between the means depends on the comparison of the
variances.
Comparing Two Means, Independent Samples
We will assume the following:
1) We have selected a random sample from each of the two populations. The r.s., of size $n_1$,
from population 1 will be denoted by $X_{11}, X_{12}, \ldots, X_{1n_1}$. These r.v.’s are assumed to be
i.i.d. with mean $\mu_1$ and variance $\sigma_1^2$. The r.s., of size $n_2$, from population 2 will be
denoted by $X_{21}, X_{22}, \ldots, X_{2n_2}$. These r.v.’s are assumed to be i.i.d. with mean $\mu_2$ and
variance $\sigma_2^2$.
2) The two populations are independent. This implies that all of the $n_1 + n_2$ r.v.’s listed above
are independent of each other.
3) Either both populations are normal, or the conditions of the Central Limit Theorem apply.
(We may also check for normality of each population using normal probability plots with the
samples of data.)
We want to estimate the difference, $\mu_1 - \mu_2$, between the population means. A logical point estimator
of this parameter is $\bar{X}_1 - \bar{X}_2$. It is easily shown that this statistic is an unbiased estimator of the
parameter. It is also easily shown that the variance of the estimator is
$$V(\bar{X}_1 - \bar{X}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}.$$
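Both claims follow from linearity of expectation and from the independence of the two samples. A brief sketch of the argument, using only the assumptions listed above:

```latex
\begin{aligned}
E(\bar{X}_1 - \bar{X}_2) &= E(\bar{X}_1) - E(\bar{X}_2) = \mu_1 - \mu_2, \\
V(\bar{X}_1 - \bar{X}_2) &= V(\bar{X}_1) + (-1)^2\, V(\bar{X}_2)
  = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}.
\end{aligned}
```

The first line needs no independence assumption. The second line has no covariance term only because the two samples are independent of each other.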
Given these results and the assumptions listed above, it is clear that the random variable
$$\frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
has an approximate standard normal distribution. We want to use this
fact to do inference about the difference between the two population means. However, the random
variable given above depends on two other unknown parameters. We need to estimate the two
population variances.
Testing Hypotheses About the Difference Between Two Means
Assume that we want to test whether the means of two populations, population 1 and population 2,
differ. In other words, our alternative hypothesis has one of the following forms:
Ha: $\mu_1 - \mu_2 \ne 0$
Ha: $\mu_1 - \mu_2 < 0$
Ha: $\mu_1 - \mu_2 > 0$
There are two cases to consider: Either the populations have the same variability, or they do not. The
form of the test statistic will depend on whether we can make the assumption that the populations have
equal variances.
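If we can assume equal variances, then we want to use the following statistic:
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)_0}{\sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}\,\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},$$
where $(\mu_1 - \mu_2)_0$ is the value of the difference specified by the null hypothesis. Here the quantity
$$S_P^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$
is the pooled variance estimate. Under the null hypothesis, this statistic has a t distribution with
d.f. = $n_1 + n_2 - 2$.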
If we cannot assume equal variances, then we want to use the following statistic:
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}.$$
Under the null hypothesis, this statistic has an approximate t-distribution with degrees of freedom
given by
$$\nu = \frac{\left(\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}\right)^2}{\dfrac{\left(S_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(S_2^2/n_2\right)^2}{n_2 - 1}}.$$
In either case, our hypothesis test proceeds as follows:
Step 1: State the null and alternative hypotheses.
Step 2: State the chosen sample sizes and significance level.
Step 3: State the test statistic (substituting 0 for $\mu_1 - \mu_2$), and state the distribution of the test
statistic under the null hypothesis.
Step 4: Find the rejection region and critical value(s).
Step 5: Choose the samples, collect the data, calculate the value of the test statistic.
Step 6: If we reject the null hypothesis, the conclusion should be stated in the following form: “We
reject H0 at the ($\alpha$) level of significance. We have sufficient evidence to conclude that (statement of
alternative hypothesis).”
If we fail to reject the null hypothesis, the conclusion should be stated in the following form: “We fail
to reject H0 at the ($\alpha$) level of significance. We do not have sufficient evidence to conclude that
(statement of alternative hypothesis).”
In the following examples, we assume that the population variances are equal. We can also do a
simple graphical check for equality, by constructing side-by-side boxplots of the two data sets. We
could also check for normality using probability plots, if we had the data sets available. We will learn
later how to test for equality of the variances.
Example: Let $\mu_1$ and $\mu_2$ denote true average tread lives for two competing brands of size P205/65R15
radial tires. We want to test whether the average tread lives are different. We choose a random sample
of 45 tires of the first type and a random sample of 45 tires of the second type. We test each tire under
identical conditions until the tread wears out. We obtain the following data: $\bar{x}_1 = 42{,}500$ mi,
$s_1 = 2200$ mi, $\bar{x}_2 = 40{,}400$ mi, $s_2 = 1900$ mi.
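As an illustration, the equal-variance test statistic for the tire data can be computed directly from the summary statistics. The sketch below uses only the Python standard library; the function name `pooled_t_statistic` and its argument names are hypothetical choices for this illustration, not part of the notes.

```python
import math


def pooled_t_statistic(mean1, sd1, n1, mean2, sd2, n2, delta0=0.0):
    """Return (t, df, pooled_variance) for the equal-variance two-sample t test.

    delta0 is the value of mu1 - mu2 under the null hypothesis (0 here).
    """
    df = n1 + n2 - 2
    # Pooled variance: the two sample variances weighted by their d.f.
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    # Estimated standard error of the difference between the sample means.
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = ((mean1 - mean2) - delta0) / std_err
    return t, df, pooled_var


# Summary data from the tire example (miles).
t_stat, df, sp2 = pooled_t_statistic(42_500, 2200, 45, 40_400, 1900, 45)
print(f"pooled variance = {sp2:.0f}, t = {t_stat:.3f}, d.f. = {df}")
```

The statistic comes out near 4.85 on 88 degrees of freedom. For a two-sided test it is compared with $\pm t_{\alpha/2,\,88}$ from a t table at whatever significance level was chosen in Step 2.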
Example: The accompanying table gives summary data on cube compressive strength (N/mm²) for
concrete specimens made with a pulverized fuel-ash mix (“A study of twenty-five-year-old pulverized
fuel ash concrete used in foundation structures,” Proceedings of the Institute of Civil Engineers,
Mar. 1985, 149-165). We want to test whether the true mean 7-day strength is less than the true mean
28-day strength.
Age (days)    Sample Size    Sample Mean    Sample SD
7             68             26.99          4.89
28            74             35.76          6.43
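A worked sketch of the equal-variance test for this example, taking the 7-day specimens as population 1, so that the hypotheses are H0: $\mu_7 - \mu_{28} = 0$ versus Ha: $\mu_7 - \mu_{28} < 0$. The intermediate values are rounded.

```latex
\begin{aligned}
s_P^2 &= \frac{(68 - 1)(4.89)^2 + (74 - 1)(6.43)^2}{68 + 74 - 2}
       = \frac{1602.11 + 3018.18}{140} \approx 33.00, \\
t &= \frac{(26.99 - 35.76) - 0}{\sqrt{33.00\left(\dfrac{1}{68} + \dfrac{1}{74}\right)}}
   \approx \frac{-8.77}{0.965} \approx -9.09,
\qquad \text{d.f.} = 140.
\end{aligned}
```

For a lower-tailed test the rejection region is $t \le -t_{\alpha,\,140}$. The example does not state $\alpha$; assuming $\alpha = 0.05$ for illustration, the table value is roughly $-1.66$, and the observed statistic lies far below it.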
Confidence Intervals for Differences Between Population Means
We can find confidence interval estimates for the difference between two population means
(independent samples) using the following formulas, depending on whether the population variances
are equal or unequal:
1) For equal population variances, use
$$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2,\,\text{d.f.}} \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}.$$
In this case, d.f. = $n_1 + n_2 - 2$.
2) For unequal population variances, use
$$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2,\,\text{d.f.}} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}.$$
In this case, d.f. = the smaller of the values $n_1 - 1$ and $n_2 - 1$. This is a conservative choice; it never
exceeds the approximate degrees of freedom given earlier for the corresponding test.
Example: Estimate the difference in mean tread lives from the first example above. Interpret this
interval estimate.
Example: Estimate the difference, $\mu_7 - \mu_{28}$, in mean compressive strengths from the second
example above. Interpret this interval estimate.
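The two interval estimates can be sketched from the summary statistics as follows. The examples do not state a confidence level, so 95% is assumed here. The Python standard library has no t quantile function, so the sketch substitutes the standard normal quantile for the t critical value. That is an approximation, reasonable only because the degrees of freedom are large (88 and 140); with a t table or statistical software the exact value should be used, giving a slightly wider interval. The function name `pooled_ci` is a hypothetical choice for this illustration.

```python
import math
from statistics import NormalDist


def pooled_ci(mean1, sd1, n1, mean2, sd2, n2, crit):
    """Equal-variance interval for mu1 - mu2, given a critical value `crit`."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    half_width = crit * math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    diff = mean1 - mean2
    return diff - half_width, diff + half_width


# ASSUMPTION: 95% level, with the normal quantile standing in for t_{0.025, d.f.}.
z_crit = NormalDist().inv_cdf(0.975)

tire_ci = pooled_ci(42_500, 2200, 45, 40_400, 1900, 45, z_crit)
concrete_ci = pooled_ci(26.99, 4.89, 68, 35.76, 6.43, 74, z_crit)

print(f"tires (mi):        ({tire_ci[0]:.0f}, {tire_ci[1]:.0f})")
print(f"concrete (N/mm^2): ({concrete_ci[0]:.2f}, {concrete_ci[1]:.2f})")
```

Under these assumptions, neither interval contains 0. The tire interval lies entirely above 0 and the concrete interval entirely below it, which agrees with the direction of the sample differences.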
Choice of Sample Sizes, When Variances Are Equal
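For the two-sided alternative hypothesis, HA: $\mu_1 - \mu_2 \ne \Delta_0$, with equal sample sizes and equal
population variances, we may use Charts Va and Vb in the Appendix, together with
$$d = \frac{|\Delta - \Delta_0|}{2\sigma},$$
to find the appropriate sample size. Here $\Delta$ is the true difference between the means, $\Delta_0$ is the
hypothesized difference, and $\sigma$ is the common population standard deviation. The sample size read
from the curve will be $n^* = 2n - 1$.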
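The arithmetic around the chart lookup can be sketched as below. The chart itself still has to be read by hand. Both function names and every number in the usage lines are hypothetical, chosen only to illustrate the conversion.

```python
import math


def oc_curve_parameter(true_diff, delta0, sigma):
    """d = |true_diff - delta0| / (2 * sigma), the value located on the chart."""
    return abs(true_diff - delta0) / (2 * sigma)


def per_group_sample_size(n_star):
    """Invert n* = 2n - 1, rounding up so n is a whole number of observations."""
    return math.ceil((n_star + 1) / 2)


# Hypothetical inputs: detect a true difference of 10 when sigma = 8, H0 value 0.
d = oc_curve_parameter(true_diff=10, delta0=0, sigma=8)
# Hypothetical chart reading: suppose the curve at this d gives n* = 39.
n = per_group_sample_size(39)
print(f"d = {d}, n per sample = {n}")
```

Rounding up is a design choice made here so that the per-sample size is never smaller than the chart requires.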
Tests of Hypotheses Concerning the Difference Between Two Population Means, Unequal
Variances
If we do not have reason to believe that the populations have equal variability, we should check for
equal variability by some method. One way to do this is to do a hypothesis test in which the null
hypothesis is that the population variances are equal. Another way is to graphically compare the two
data sets. We will look at the second method now, and look at testing equality of the variances later.
Example: The void volume within a textile fabric affects comfort, flammability, and insulation
properties. Permeability of a fabric refers to the accessibility of void spaces to the flow of a gas or
liquid. The paper “The relationship between porosity and air permeability of woven textile fabrics”
(Journal of Testing and Evaluation, 1997: 108-114) gave summary information on air permeability
(cm³/cm²/sec) for a number of different fabric types. Consider the following data on two different
types of plain-weave fabric:
Fabric Type    Sample Size    Sample Mean    Sample SD
Cotton         10             51.71          0.79
Triacetate     10             136.14         3.59
We want to test whether plain-weave triacetate has a higher mean permeability than plain-weave
cotton. We also want a 95% confidence interval estimate of the difference between the means.
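A sketch of the unequal-variance calculation for the fabric data, using only the standard library. Triacetate is taken as population 1 so that the alternative is Ha: $\mu_1 - \mu_2 > 0$. The function name `welch_t_test` is a hypothetical choice, and the critical value 2.262 is the t table entry for upper-tail area 0.025 with 9 d.f., matching the conservative rule d.f. = min($n_1 - 1$, $n_2 - 1$) for the interval.

```python
import math


def welch_t_test(mean1, sd1, n1, mean2, sd2, n2, delta0=0.0):
    """Return (t, approx_df, std_err) for the unequal-variance two-sample t test."""
    v1 = sd1**2 / n1  # estimated variance of the first sample mean
    v2 = sd2**2 / n2  # estimated variance of the second sample mean
    std_err = math.sqrt(v1 + v2)
    t = ((mean1 - mean2) - delta0) / std_err
    # Approximate degrees of freedom for the unequal-variance statistic.
    approx_df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, approx_df, std_err


# Population 1 = triacetate, population 2 = cotton.
t_stat, approx_df, se = welch_t_test(136.14, 3.59, 10, 51.71, 0.79, 10)

# 95% interval with the conservative d.f. = min(n1 - 1, n2 - 1) = 9.
T_CRIT_9DF = 2.262  # table value, upper-tail area 0.025, 9 d.f.
diff = 136.14 - 51.71
ci = (diff - T_CRIT_9DF * se, diff + T_CRIT_9DF * se)

print(f"t = {t_stat:.2f}, approximate d.f. = {approx_df:.2f}")
print(f"95% CI for mu1 - mu2: ({ci[0]:.2f}, {ci[1]:.2f})")
```

The approximate degrees of freedom come out just under 10, close to the conservative value of 9, because almost all of the estimated variance comes from the triacetate sample.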
Tests of Hypotheses Concerning the Difference Between Two Population Means, Dependent
Samples
When there is a natural pairing between each member of one population and a member of the other
population, the test for differences between the population means must be done somewhat differently.
For dependent samples, inferences are performed based on the differences between the scores for each
pair.
Let X1, X2, X3, …, Xn be the observations made on the members of the first sample, and let Y1, Y2, Y3,
…, Yn be the observations made on members of the second sample. The random variables we will use
in this test will be D1 = X1 – Y1, D2 = X2 – Y2, D3 = X3 – Y3, …, Dn = Xn – Yn. Using the set of
difference random variables, we conduct a one-sample t-test. The sample mean is then
$$\bar{D} = \frac{\sum_{i=1}^{n} D_i}{n},$$
and the sample variance is
$$S_D^2 = \frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}.$$
We may either use the raw data of difference scores to conduct our t-test, or we may use these sample
statistics based on the difference scores.
The alternative hypotheses have one of the following forms:
Ha: $\mu_1 - \mu_2 \ne 0$ or Ha: $\mu_D \ne 0$
Ha: $\mu_1 - \mu_2 < 0$ or Ha: $\mu_D < 0$
Ha: $\mu_1 - \mu_2 > 0$ or Ha: $\mu_D > 0$
Here $\mu_D = \mu_1 - \mu_2$, the difference between the population means.
The steps in the hypothesis test are similar to those in previous tests:
Step 1: State the two hypotheses to be tested.
Step 2: State the sample size (note that both samples must have the same size), and the chosen
significance level.
Step 3: The test statistic is $t = \dfrac{\bar{d} - \mu_{D0}}{s_D / \sqrt{n}}$, which under the null hypothesis has a t distribution
with d.f. = n – 1.
Step 4: Find the rejection region and critical value(s).
Step 5: Choose samples, collect data, calculate the value of the test statistic.
Step 6: State the conclusion, in terms of the alternative hypothesis, and being sure to include the
significance level of the test.
Confidence Intervals for Differences Between Related Population Means
If we want to obtain an interval estimate of the difference between two population means, when there
is a pairwise relationship between members of one population and members of the other, we first
compute the difference scores $D_i = X_i - Y_i$, $i = 1, \ldots, n$.
Note that, if the samples are either from normal distributions or are large enough that we may use the
Central Limit Theorem, then
$$T = \frac{\bar{D} - \mu_D}{S_D / \sqrt{n}}$$
has an (approximate) t distribution with d.f. = n – 1. Then a (1 – α)100% confidence
interval estimate for $\mu_D$ is
$$\left(\bar{D} - t_{n-1,\,\alpha/2}\,\frac{S_D}{\sqrt{n}},\;\; \bar{D} + t_{n-1,\,\alpha/2}\,\frac{S_D}{\sqrt{n}}\right).$$
Example: In an experiment designed to study the effects of illumination level on task performance
(“Performance of Complex Tasks Under Different Levels of Illumination,” J. Illuminating
Engineering, 1976: 235-242), subjects were required to insert a fine-tipped probe into the eyeholes of
ten needles in rapid succession both for a low-light-level with a black background and for a higher
level with a white background. It is of interest to compare the mean times for completion of the task
under the two different conditions. The data are given in the table below. We want to test whether the
higher level of illumination yields a decrease of more than 5 seconds in true mean task completion
time.
Subject    1        2        3        4        5        6        7        8        9
Black      25.85    28.84    32.05    25.74    20.89    41.05    25.01    24.96    27.47
White      18.23    20.84    22.96    19.68    19.50    24.98    16.61    16.07    24.59
We also want a 95% confidence interval estimate of the difference between the mean times to
complete the task under low illumination versus high illumination.
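The paired calculation for these data can be sketched with the standard library alone. The differences are taken as black minus white, so the claim "a decrease of more than 5 seconds" becomes Ha: $\mu_D > 5$ with null value 5. The critical value 2.306 is the t table entry for upper-tail area 0.025 with 8 d.f.; the variable names are choices made for this illustration.

```python
import math
from statistics import mean, stdev

# Task completion times (seconds) for the 9 subjects in the table.
black = [25.85, 28.84, 32.05, 25.74, 20.89, 41.05, 25.01, 24.96, 27.47]
white = [18.23, 20.84, 22.96, 19.68, 19.50, 24.98, 16.61, 16.07, 24.59]

# Difference scores D_i = X_i - Y_i (black minus white) for each subject.
diffs = [b - w for b, w in zip(black, white)]
n = len(diffs)
d_bar = mean(diffs)
s_d = stdev(diffs)  # sample standard deviation, divisor n - 1
std_err = s_d / math.sqrt(n)

# Test of H0: mu_D = 5 against Ha: mu_D > 5.
MU_D0 = 5.0
t_stat = (d_bar - MU_D0) / std_err

# 95% confidence interval for mu_D with d.f. = n - 1 = 8.
T_CRIT_8DF = 2.306  # table value, upper-tail area 0.025, 8 d.f.
ci = (d_bar - T_CRIT_8DF * std_err, d_bar + T_CRIT_8DF * std_err)

print(f"d_bar = {d_bar:.2f}, s_D = {s_d:.3f}, t = {t_stat:.3f}, d.f. = {n - 1}")
print(f"95% CI for mu_D: ({ci[0]:.2f}, {ci[1]:.2f})")
```

The one-sided test compares the statistic with $t_{\alpha,\,8}$. The example does not state $\alpha$; at $\alpha = 0.05$ the table value is 1.860, which the observed statistic of about 1.87 only barely exceeds, so the conclusion is sensitive to the significance level chosen.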